Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crashes (SIGABRT) Under Test Automation #539

Open
clintmiller opened this issue Sep 27, 2023 · 15 comments
Open

Crashes (SIGABRT) Under Test Automation #539

clintmiller opened this issue Sep 27, 2023 · 15 comments

Comments

@clintmiller
Copy link

clintmiller commented Sep 27, 2023

Environment

Operating System

Intel Mac, x86_64

ProductName:		macOS
ProductVersion:		13.4.1
ProductVersionExtra:	(c)
BuildVersion:		22F770820d

TinyTDS Version and Information

[TinyTds][v2.1.5][tsql]: /usr/local/bin/tsql
Compile-time settings (established with the "configure" script)
                            Version: freetds v1.3.20
             freetds.conf directory: /usr/local/etc
     MS db-lib source compatibility: no
        Sybase binary compatibility: yes
                      Thread safety: yes
                      iconv library: yes
                        TDS version: 7.3
                              iODBC: no
                           unixodbc: yes
              SSPI "trusted" logins: no
                           Kerberos: yes
                            OpenSSL: yes
                             GnuTLS: no
                               MARS: yes

I'm actually pointing at a post-2.1.5 commit on tiny_tds: 23ed1e4, where #527 was merged.

FreeTDS Version

freetds: 1.3.20

SQL Server

Running against 2017 for Linux in a Docker container via Docker Desktop for Mac. SQL Server Authentication.

Description

In the course of trying to upgrade an application from Ruby 2.5/Rails 5.2.x to Ruby 3.1/Rails 6.1.x, I have noticed a significant increase in automated test failures (we're using rspec-rails). These failures seem take a couple of forms:

1. An EXC_BAD_ACCESS (SIGABRT).

Terminal output:

<big long memory region dump trimmed>

Abort trap: 6

OS Crash Report:

Thread 0 Crashed::  Dispatch queue: com.apple.main-thread
0   libsystem_kernel.dylib        	    0x7ff8092e61f2 __pthread_kill + 10
1   libsystem_pthread.dylib       	    0x7ff80931dee6 pthread_kill + 263
2   libsystem_c.dylib             	    0x7ff809244b45 abort + 123
3   libruby.3.1.dylib             	       0x106e845e9 die + 9
4   libruby.3.1.dylib             	       0x106e84810 rb_bug_for_fatal_signal + 544
5   libruby.3.1.dylib             	       0x106fb475b sigsegv + 91
6   libsystem_platform.dylib      	    0x7ff80934b5ed _sigtramp + 29
7   ???                           	               0x0 ???
8   tiny_tds.bundle               	       0x10c34543e nogvl_dbsqlok + 88 (result.c:130) [inlined]
9   tiny_tds.bundle               	       0x10c34543e rb_tinytds_result_ok_helper + 109 (result.c:179) [inlined]
10  tiny_tds.bundle               	       0x10c34543e rb_tinytds_result_exec_helper + 126 (result.c:185)
11  tiny_tds.bundle               	       0x10c344528 rb_tinytds_result_do + 56 (result.c:528)
12  libruby.3.1.dylib             	       0x10704864d vm_call_cfunc_with_frame + 349
<snip>
Thread 1:
0   libsystem_kernel.dylib        	    0x7ff8092e629e poll + 10
1   libruby.3.1.dylib             	       0x1070004ac timer_pthread_fn + 140
2   libsystem_pthread.dylib       	    0x7ff80931e1d3 _pthread_start + 125
3   libsystem_pthread.dylib       	    0x7ff809319bd3 thread_start + 15

Thread 2:: connection_pool.rb:323
0   libsystem_kernel.dylib        	    0x7ff8092e629e poll + 10
1   libruby.3.1.dylib             	       0x106ff4acd rb_sigwait_sleep + 557
2   libruby.3.1.dylib             	       0x106ff5fb4 native_sleep + 500
3   libruby.3.1.dylib             	       0x106ff6b9b sleep_hrtime + 331
4   libruby.3.1.dylib             	       0x106f5f475 rb_f_sleep + 85
5   libruby.3.1.dylib             	       0x10704864d vm_call_cfunc_with_frame + 349
<snip>
Thread 3:: worker-1
0   libsystem_kernel.dylib        	    0x7ff8092e20ee __psynch_cvwait + 10
1   libsystem_pthread.dylib       	    0x7ff80931e758 _pthread_cond_wait + 1242
2   libruby.3.1.dylib             	       0x106fffde5 gvl_acquire_common + 309
3   libruby.3.1.dylib             	       0x106ff7648 blocking_region_end + 216
4   libruby.3.1.dylib             	       0x106ff716f rb_nogvl + 175
5   tiny_tds.bundle               	       0x10c345433 nogvl_dbsqlok + 77 (result.c:129) [inlined]
6   tiny_tds.bundle               	       0x10c345433 rb_tinytds_result_ok_helper + 98 (result.c:179) [inlined]
7   tiny_tds.bundle               	       0x10c345433 rb_tinytds_result_exec_helper + 115 (result.c:185)
8   tiny_tds.bundle               	       0x10c344528 rb_tinytds_result_do + 56 (result.c:528)
9   libruby.3.1.dylib             	       0x10704864d vm_call_cfunc_with_frame + 349
<snip>

I'm pretty sure this was a simple unit test, given the running threads in the Ruby process. One thing that immediately catches my eye is that there appear to be two threads in the TinyTDS C code at the same time. I'm also curious about the nogvl_* functions, but only b/c of their name. I'm not sure I fully understand their purpose.

2. An EXC_CRASH (SIGABRT).

Terminal output:

/Users/clintmiller/.asdf/installs/ruby/3.1.4/lib/ruby/gems/3.1.0/gems/activerecord-sqlserver-adapter-6.1.3.0/lib/active_record/connection_adapters/sqlserver/database_statements.rb:463: warning: TinyTds: dbsqlsend() returned FAIL.

Assertion failed: (conn->in_net_tds == NULL), function tds_free_connection, file mem.c, line 1209.

OS Crash Report:

Thread 0::  Dispatch queue: com.apple.main-thread
0   libsystem_kernel.dylib        	    0x7ff8092e629e poll + 10
1   libsybdb.5.dylib              	       0x1101c3ba6 tds_select + 221
2   libsybdb.5.dylib              	       0x1101c8419 tds_connection_network + 110
3   libsybdb.5.dylib              	       0x1101c8212 tds_read_packet + 235
4   libsybdb.5.dylib              	       0x1101afb94 tds_get_byte + 25
5   libsybdb.5.dylib              	       0x1101a8acf tds_process_tokens + 183
6   libsybdb.5.dylib              	       0x1101953e5 dbsqlok + 175
7   libruby.3.1.dylib             	       0x10ad8b158 rb_nogvl + 152
8   tiny_tds.bundle               	       0x11011c433 nogvl_dbsqlok + 77 (result.c:129) [inlined]
9   tiny_tds.bundle               	       0x11011c433 rb_tinytds_result_ok_helper + 98 (result.c:179) [inlined]
10  tiny_tds.bundle               	       0x11011c433 rb_tinytds_result_exec_helper + 115 (result.c:185)
11  tiny_tds.bundle               	       0x11011b528 rb_tinytds_result_do + 56 (result.c:528)
12  libruby.3.1.dylib             	       0x10addc64d vm_call_cfunc_with_frame + 349
<snip>
Thread 1:
0   libsystem_kernel.dylib        	    0x7ff8092e629e poll + 10
1   libruby.3.1.dylib             	       0x10ad944ac timer_pthread_fn + 140
2   libsystem_pthread.dylib       	    0x7ff80931e1d3 _pthread_start + 125
3   libsystem_pthread.dylib       	    0x7ff809319bd3 thread_start + 15

Thread 2:: connection_pool.rb:323
0   libsystem_kernel.dylib        	    0x7ff8092e629e poll + 10
1   libruby.3.1.dylib             	       0x10ad88acd rb_sigwait_sleep + 557
2   libruby.3.1.dylib             	       0x10ad89fb4 native_sleep + 500
3   libruby.3.1.dylib             	       0x10ad8ab9b sleep_hrtime + 331
4   libruby.3.1.dylib             	       0x10acf3475 rb_f_sleep + 85
5   libruby.3.1.dylib             	       0x10addc64d vm_call_cfunc_with_frame + 349
<snip>
Thread 3:: Timeout stdlib thread
0   libsystem_kernel.dylib        	    0x7ff8092e20ee __psynch_cvwait + 10
1   libsystem_pthread.dylib       	    0x7ff80931e758 _pthread_cond_wait + 1242
2   libruby.3.1.dylib             	       0x10ad92c06 native_cond_sleep + 502
3   libruby.3.1.dylib             	       0x10ad89e30 native_sleep + 112
4   libruby.3.1.dylib             	       0x10ad8a8b9 sleep_forever + 457
5   libruby.3.1.dylib             	       0x10ad95c17 queue_sleep + 39
6   libruby.3.1.dylib             	       0x10ac24ece rb_ensure + 350
7   libruby.3.1.dylib             	       0x10ad95b62 queue_do_pop + 226
8   libruby.3.1.dylib             	       0x10addc64d vm_call_cfunc_with_frame + 349
<snip>
Thread 4:: server.rb:76
0   libsystem_kernel.dylib        	    0x7ff8092e20ee __psynch_cvwait + 10
1   libsystem_pthread.dylib       	    0x7ff80931e758 _pthread_cond_wait + 1242
2   libruby.3.1.dylib             	       0x10ad92c06 native_cond_sleep + 502
3   libruby.3.1.dylib             	       0x10ad89e30 native_sleep + 112
4   libruby.3.1.dylib             	       0x10ad94276 thread_join_sleep + 230
5   libruby.3.1.dylib             	       0x10ac24ece rb_ensure + 350
6   libruby.3.1.dylib             	       0x10ad940bd thread_join + 157
7   libruby.3.1.dylib             	       0x10ad90d6e thread_join_m + 238
8   libruby.3.1.dylib             	       0x10addc64d vm_call_cfunc_with_frame + 349
<snip>
Thread 5:: puma reactor
0   libsystem_kernel.dylib        	    0x7ff8092e41ee kevent + 10
1   nio4r_ext.bundle              	       0x1191c6c69 kqueue_poll + 281 (ev_kqueue.c:102)
2   nio4r_ext.bundle              	       0x1191c3f55 ev_backend_poll + 21 (ev.c:4026)
3   libruby.3.1.dylib             	       0x10ad8b158 rb_nogvl + 152
4   nio4r_ext.bundle              	       0x1191c33b0 ev_run + 768 (ev.c:4216)
5   nio4r_ext.bundle              	       0x1191c902a NIO_Selector_run + 86 (selector.c:476) [inlined]
6   nio4r_ext.bundle              	       0x1191c902a NIO_Selector_select_synchronized + 186 (selector.c:431)
7   libruby.3.1.dylib             	       0x10ac24ece rb_ensure + 350
8   nio4r_ext.bundle              	       0x1191c8775 NIO_Selector_select + 101 (selector.c:409)
9   libruby.3.1.dylib             	       0x10addc64d vm_call_cfunc_with_frame + 349
<snip>
Thread 6:: puma threadpool reaper
0   libsystem_kernel.dylib        	    0x7ff8092e20ee __psynch_cvwait + 10
1   libsystem_pthread.dylib       	    0x7ff80931e758 _pthread_cond_wait + 1242
2   libruby.3.1.dylib             	       0x10ad92bd7 native_cond_sleep + 455
3   libruby.3.1.dylib             	       0x10ad89e30 native_sleep + 112
4   libruby.3.1.dylib             	       0x10ad8ab9b sleep_hrtime + 331
5   libruby.3.1.dylib             	       0x10acf3475 rb_f_sleep + 85
6   libruby.3.1.dylib             	       0x10addc64d vm_call_cfunc_with_frame + 349
<snip>
Thread 7:: puma threadpool trimmer
0   libsystem_kernel.dylib        	    0x7ff8092e20ee __psynch_cvwait + 10
1   libsystem_pthread.dylib       	    0x7ff80931e758 _pthread_cond_wait + 1242
2   libruby.3.1.dylib             	       0x10ad92bd7 native_cond_sleep + 455
3   libruby.3.1.dylib             	       0x10ad89e30 native_sleep + 112
4   libruby.3.1.dylib             	       0x10ad8ab9b sleep_hrtime + 331
5   libruby.3.1.dylib             	       0x10acf3475 rb_f_sleep + 85
6   libruby.3.1.dylib             	       0x10addc64d vm_call_cfunc_with_frame + 349
<snip>
Thread 8:: puma server
0   libsystem_kernel.dylib        	    0x7ff8092e8282 __select + 10
1   libruby.3.1.dylib             	       0x10ad8d8b9 do_select + 425
2   libruby.3.1.dylib             	       0x10ac24ece rb_ensure + 350
3   libruby.3.1.dylib             	       0x10ad8d5ca rb_thread_fd_select + 1178
4   libruby.3.1.dylib             	       0x10ac6b5dd select_call + 1293
5   libruby.3.1.dylib             	       0x10ac24ece rb_ensure + 350
6   libruby.3.1.dylib             	       0x10ac5eb6c rb_f_select + 220
7   libruby.3.1.dylib             	       0x10addc64d vm_call_cfunc_with_frame + 349
<snip>
Thread 9:: worker.rb:96
0   libsystem_kernel.dylib        	    0x7ff8092e20ee __psynch_cvwait + 10
1   libsystem_pthread.dylib       	    0x7ff80931e758 _pthread_cond_wait + 1242
2   libruby.3.1.dylib             	       0x10ad92c06 native_cond_sleep + 502
3   libruby.3.1.dylib             	       0x10ad89e30 native_sleep + 112
4   libruby.3.1.dylib             	       0x10ad8a8b9 sleep_forever + 457
5   libruby.3.1.dylib             	       0x10ad95c17 queue_sleep + 39
6   libruby.3.1.dylib             	       0x10ac24ece rb_ensure + 350
7   libruby.3.1.dylib             	       0x10ad95b62 queue_do_pop + 226
8   libruby.3.1.dylib             	       0x10addc64d vm_call_cfunc_with_frame + 349
<snip>
Thread 10:: worker-1
0   libsystem_kernel.dylib        	    0x7ff8092e20ee __psynch_cvwait + 10
1   libsystem_pthread.dylib       	    0x7ff80931e758 _pthread_cond_wait + 1242
2   libruby.3.1.dylib             	       0x10ad92c06 native_cond_sleep + 502
3   libruby.3.1.dylib             	       0x10ad89e30 native_sleep + 112
4   libruby.3.1.dylib             	       0x10ad8a8b9 sleep_forever + 457
5   libruby.3.1.dylib             	       0x10ad95c17 queue_sleep + 39
6   libruby.3.1.dylib             	       0x10ac24ece rb_ensure + 350
7   libruby.3.1.dylib             	       0x10ad95b62 queue_do_pop + 226
8   libruby.3.1.dylib             	       0x10addc64d vm_call_cfunc_with_frame + 349
<snip>
Thread 11:: worker-2
0   libsystem_kernel.dylib        	    0x7ff8092e20ee __psynch_cvwait + 10
1   libsystem_pthread.dylib       	    0x7ff80931e758 _pthread_cond_wait + 1242
2   libruby.3.1.dylib             	       0x10ad92c06 native_cond_sleep + 502
3   libruby.3.1.dylib             	       0x10ad89e30 native_sleep + 112
4   libruby.3.1.dylib             	       0x10ad8a8b9 sleep_forever + 457
5   libruby.3.1.dylib             	       0x10ad95c17 queue_sleep + 39
6   libruby.3.1.dylib             	       0x10ac24ece rb_ensure + 350
7   libruby.3.1.dylib             	       0x10ad95b62 queue_do_pop + 226
8   libruby.3.1.dylib             	       0x10addc64d vm_call_cfunc_with_frame + 349
<snip>
Thread 12:: worker-1
0   libsystem_kernel.dylib        	    0x7ff8092e20ee __psynch_cvwait + 10
1   libsystem_pthread.dylib       	    0x7ff80931e758 _pthread_cond_wait + 1242
2   libruby.3.1.dylib             	       0x10ad92c06 native_cond_sleep + 502
3   libruby.3.1.dylib             	       0x10ad89e30 native_sleep + 112
4   libruby.3.1.dylib             	       0x10ad8a8b9 sleep_forever + 457
5   libruby.3.1.dylib             	       0x10ad95c17 queue_sleep + 39
6   libruby.3.1.dylib             	       0x10ac24ece rb_ensure + 350
7   libruby.3.1.dylib             	       0x10ad95b62 queue_do_pop + 226
8   libruby.3.1.dylib             	       0x10addc64d vm_call_cfunc_with_frame + 349
<snip>
Thread 13:: worker-1
0   libsystem_kernel.dylib        	    0x7ff8092e20ee __psynch_cvwait + 10
1   libsystem_pthread.dylib       	    0x7ff80931e758 _pthread_cond_wait + 1242
2   libruby.3.1.dylib             	       0x10ad92c06 native_cond_sleep + 502
3   libruby.3.1.dylib             	       0x10ad89e30 native_sleep + 112
4   libruby.3.1.dylib             	       0x10ad8a8b9 sleep_forever + 457
5   libruby.3.1.dylib             	       0x10ad95c17 queue_sleep + 39
6   libruby.3.1.dylib             	       0x10ac24ece rb_ensure + 350
7   libruby.3.1.dylib             	       0x10ad95b62 queue_do_pop + 226
8   libruby.3.1.dylib             	       0x10addc64d vm_call_cfunc_with_frame + 349
<snip>
Thread 14 Crashed:: worker-2
0   libsystem_kernel.dylib        	    0x7ff8092e61f2 __pthread_kill + 10
1   libsystem_pthread.dylib       	    0x7ff80931dee6 pthread_kill + 263
2   libsystem_c.dylib             	    0x7ff809244b45 abort + 123
3   libsystem_c.dylib             	    0x7ff809243e5e __assert_rtn + 314
4   libsybdb.5.dylib              	       0x1101ce32f tds_free_connection.cold.1 + 35
5   libsybdb.5.dylib              	       0x1101a6509 tds_free_connection + 368
6   libsybdb.5.dylib              	       0x1101a671f tds_free_socket + 215
7   libsybdb.5.dylib              	       0x110194a27 dbclose + 239
8   tiny_tds.bundle               	       0x1101197af rb_tinytds_close + 63 (client.c:266)
9   libruby.3.1.dylib             	       0x10addc64d vm_call_cfunc_with_frame + 349
<snip>

Obviously, this case with more threads is a system/integration test that started a webserver.

Other Notes

It's interesting to note, that the application boots and runs in my development environment w/o any issues. I'm not sure how it will do with a production workload given the crashes I'm seeing in our test suite.

Next Steps

Making a minimal reproduction test-case would be useful, but I'm not quite sure where to start. On Ruby 2.5/Rails 5.2, our test suite rarely (if ever), crashed in this manner. Is there any other additional information I can provide or debug builds of TinyTDS I could use to reveal more helpful information?

@andyundso
Copy link
Member

Hi, couple of questions:

  • Against which SQL server do you run the test suite? What kind of of authentication do you run?
  • The crash reports seem from a Mac system. With test automation, do you mean just the test suite of your project or do you run automated tests against Mac on a continuous integration system, like GitHub Actions?
  • If you update your tiny_tds to the mentioned commit 23ed1e4 before you did the Ruby and Rails upgrade, can you reproduce these issues?
  • Do you run ARM or Intel on your Mac?

@clintmiller
Copy link
Author

Updated above w/ answers- also copied here for convenience, thank you @andyundso.

Against which SQL server do you run the test suite? What kind of of authentication do you run?

Running against 2017 for Linux in a Docker container via Docker Desktop for Mac. SQL Server Authentication. I don't think I'm having an authentication problem as my app (and tests) are able to connect to the database and execute queries.

The crash reports seem from a Mac system. With test automation, do you mean just the test suite of your project or do you run automated tests against Mac on a continuous integration system, like GitHub Actions?

By "test automation", I simply meant the execution of the test suite for this project. As part of this upgrade process, I will get the suite running in an automated CI system, either w/ updates to our existing, but dated Jenkins CI installation, or in GitHub Actions. I have not done this yet.

If you update your tiny_tds to the mentioned commit 23ed1e4 before you did the Ruby and Rails upgrade, can you reproduce these issues?

I will try this against our Rails 5.2 branch, assuming the newer TinyTDS is backwards compatible both w/ Rails 5.2 (should be, I think), and Ruby 2.5.x (??).

Do you run ARM or Intel on your Mac?

This is on an Intel Mac.

@andyundso
Copy link
Member

Disclaimer: I'm unfamiliar with the C code in tiny_tds, just trying to help out :).

I also own an Intel Mac, the same OS version as yours, the same FreeTDS version. I maintain two Rails projects, both with RSpec, but one runs Ruby 3.0, the other 3.1. I checked if I also ran into these issues when using the version of tiny_tds you mentioned in your issue and the SQL server 2017 in Docker. I have run about ten runs on both apps so far with no problems. Are your crashes more infrequent than every tenth run?

In general, from what I read in the code, the nogvl stuff executes code on a separate Ruby thread but releases the Global VM Lock, as we no longer execute Ruby code, but instead talk with FreeTDS, who then talks to MSSQL. Once the function finishes, it re-acquires the GVL.

EXC_BAD_ACCESS means we write memory somewhere where we should not. EXC_CRASH reacts to a SIGABRT, likely raised of freetds because of the assertion error that it hits.

Is there some consistency when you hit these errors? For example, I assume you run your test suite in random order. Once it crashes and you re-run the test suite with the same seed, does it crash again with the same error?

As mentioned, it would be interesting to know if you also hit this error when updating tiny_tds, but run the Ruby 2.5 / Rails 5.2 version of your app. tiny_tds should work with Ruby 2.5, at least our CI is green :)

@bf4
Copy link
Contributor

bf4 commented Oct 11, 2023

similar error when in rails console on Ruby 3.1.2, Rails 7.0, TinyTDS 2.1.5, freetds v1.3.20 on M1 mac running against various MS Sql Server servers e.g. Microsoft SQL Server 2016 (SP3-CU1-GDR) (KB5021128) - 13.0.7024.30 (X64)

usually after I've made a bunch of queries and have had the console open a while, but only on exit

Assertion failed: (conn->in_net_tds == NULL), function tds_free_connection, file mem.c, line 1209.
Abort trap: 6

@clintmiller
Copy link
Author

I was able to reliably trigger this with a single spec example and it's beginning to form some type of working theory for me. This is an excerpt from a Rails controller spec:

  describe 'GET finish' do
    let(:product_download) { create(:product_download) }
    let(:params) do
      { membership: product_download.membership_id, log: product_download.id, bytes: 123456 }
    end

    it 'closes the product download log log' do
      get :finish, params: params
      updated_log = ProductDownload.find(product_download.id) # <= This is the boom!
      ...
    end
  end

Test output:

Igniter::DownloadController
  GET finish
/Users/clintmiller/.asdf/installs/ruby/3.1.4/lib/ruby/gems/3.1.0/gems/activerecord-sqlserver-adapter-6.1.3.0/lib/active_record/connection_adapters/sqlserver/database_statements.rb:463: warning: TinyTds: dbsqlsend() returned FAIL.

Assertion failed: (conn->in_net_tds == NULL), function tds_free_connection, file mem.c, line 1209.
Abort trap: 6

One other important detail is that my application is a multi-tenant app, where the current tenant is a per-request concept that can be switched based on criteria like the hostname requested. One of the things that tenant switching did* is switch the database connection with the Apartment gem, which we had* forked to support the activerecord-sqlserver-adapter.

Something must be happening lifetime of these database connections when this spec is run. The database connection being used by the Rails server handling the get :finish request is closed (the request is over after all?) but is then attempted to be used again by the following line? I'm not sure that's exactly it.

*Did and *had b/c we've actually removed Apartment and the tenant switching functionality that used it to change the database connections. The tenancy concept is still in place, and requests still switch tenants as before, but we aren't manipulating the database connections during the tenant switch any longer.

@bf4
Copy link
Contributor

bf4 commented Oct 18, 2023

@clintmiller have you tried reproducing bundling against main? there may be an unreleased fix

@andyundso
Copy link
Member

I was originally on the search for something else, but I managed to bring up the issue as well:

ruby: mem.c:1202: tds_free_connection: Assertion `conn->in_net_tds == NULL' failed.
Aborted (core dumped)
rake aborted!

I added this test to the tiny_tds test suite involving threads.

it 'raises error when query cannot be sent' do
  client = new_connection
  assert_client_works(client)
    
  thread1 = Thread.new do
    client.execute("WaitFor Delay '00:00:10'")
  end

  assert thread1.alive?    
  thread2 = Thread.new do
    assert_raises(TinyTds::Error) { client.execute("WaitFor Delay '00:00:02'") }
  end
    
  thread1.join
  thread2.join
 end

but from here it gets complicated - I would have to compile Ruby in Debug mode to get meaningful pointers when the coredump is generated. and even then I am wondering how my example test applies to your test suite failure.


while writing the comment I had another idea.

what works fine is the following:

it "is another thing" do
  client = new_connection
  client.execute("SELECT 1 as [one]")
  client.execute("SELECT 1 as [one]")
end

this properly generates an error:

test_0008_is another thing                                     ERROR (0.02s)
Minitest::UnexpectedError:         TinyTds::Error: Attempt to initiate a new Adaptive Server operation with results pending
            /home/andy/dev/andyundso/tiny_tds/test/client_test.rb:153:in `execute'
            /home/andy/dev/andyundso/tiny_tds/test/client_test.rb:153:in `block (2 levels) in <class:ClientTest>'

but this test code:

it "is another thing with wait" do
  client = new_connection
  client.execute("WaitFor Delay '00:00:10'")
  client.execute("SELECT 1 as [one]")
end

generates this error:

test_0008_is another thing with wait                           ERROR (16.05s)
Minitest::UnexpectedError:         TinyTds::Error: DBPROCESS is dead or not enabled
            /home/andy/dev/andyundso/tiny_tds/test/client_test.rb:158:in `execute'
            /home/andy/dev/andyundso/tiny_tds/test/client_test.rb:158:in `block (2 levels) in <class:ClientTest>'

158 is the second call to execute from the test example above.

I have no clue yet what this all means and if it is even a lead following, but I have to leave the office for today and needed to write down my exploration so far :)

@andyundso
Copy link
Member

had some more time today. bare with me, another long comment.

so my primary goal was to dissallow executing execute when there is still something pending. I thought the fix was quite easy:

if (cwrap->userdata->dbsql_sent && !dbdead(cwrap->client) && !cwrap->userdata->closed) {
  rb_raise(cTinyTdsError, "Trying to send another query before consuming results. Please call \"do\", \"each\" or \"insert\" first.");
}

basically, tiny_tds tracks internally if a query has been sent to the server with the dbsql_sent property. this property is set to 0 once the results have been consumed (or so I thought at least). this made my test pass which I posted above with trying to run execute back to back. but I had some other test failures.

first, do did not make any troubles. do returns the amount of rows processes. this happens with the dbcount function, which I assume does not hit the server anymore, but I did not test it.

insert runs another query against the server to find the last identity value generated in the current scope. it is basically the same cycle as when running something with execute: loading the command into the command buffer, sending it to the server and fetching the result. In theory, it could be possible that if two threads share the same TinyTds::Client, the command buffer gets written by both threads at the same time.

insert is a nice feature by tiny_tds, but I am note sure if it is really needed. at least the activerecord-sql-adapter does not seem to use it.

there was one testing failing on this line with the error message I introduced earlier:

id_constraint_name = @client.execute("EXEC sp_helpindex [datatypes]").find { |row| row['index_keys'] == 'id' }['index_name']

looking through the code, I assume find makes rb_yield abort earlier, so our each function is not executed in its entirety and therefore dbsql_sent is not set to 0.

... but each does so many things, I am not quite sure where I need to add what. I generally feel each should behave more like an Enumerator for an array.

  • The first option can be removed, that is a given by Enumerable.
  • If you call each without a block, it will return all results. again, Enumerable has a method called to_a, which does exactly that.
  • symbolize_keys could be removed, as it boils down to map(&:symbolize_keys).

I am not sure how I feel about cache_rows: it is a nice idea, but also introduces an issue: if you exit the enumerator early, incomplete results are cached.

result = client.execute("SELECT 1 as [one]; SELECT 2 as [two]; SELECT 3 as [three]")
result.to_a
# => [{"one"=>1}, {"two"=>2}, {"three"=>3}]

# exit early out of the iterator
result = client.execute("SELECT 1 as [one]; SELECT 2 as [two]; SELECT 3 as [three]")
result.each(cache_rows: true) { |r| break if r.key?("two")  }
# => nil
result.to_a
# => [{"one"=>1}]

# for comparison: here is how a regular array behaves
numbers = [1, 2, 3, 4, 5]
numbers.each { |n| break if n == 3 }
# => nil
numbers.to_a
# => [1, 2, 3, 4, 5]

I looked into the pg and trilogy gem, both do not add such a cache, so because of the error potential, I would likely remove it.

Then there are as, timezone and empty_sets, which directly impact how the TinyTds::Result will look. I would likely move them to execute as result_options. The main reason is: when calling any Enumerable method, it will mostly call each without any arguments. So you are forced to use the default options given by TinyTds. when we would pass these option from execute to the result, it could behave the same for any called method of Enumerable.


again, not sure where this all leads. but I think if we can fix this Assertion conn->in_net_tds == NULL' failed error while making the tiny_tds code a tad bit more modern / elegant, might have also help to avoid these random test crashes.

@andyundso
Copy link
Member

new week, new long comment. but this should hopefully be the last one before the PR, as I am quite clear on what to do next.

I went over my idea from last week (show an error when the user tries to run another execute without "consuming" the results first) in combination with the investigation about each once again. My idea would align what is documented in the README:

It is important that you either return the data from the query, most likely with the #each method, or that you cancel the results before asking the client to execute another SQL batch.

But I do not like this API. Needing to cancel a result when fetching something with each does not feel very Ruby to me.

Additionally, what I already showed earlier is that the results are incomplete when the cache by tiny_tds is activated (by default) and you return out of the iterator early:

# exit early out of the iterator
result = client.execute("SELECT 1 as [one]; SELECT 2 as [two]; SELECT 3 as [three]")
result.each(cache_rows: true) { |r| break if r.key?("two")  }
# => nil
result.to_a
# => [{"one"=>1}]

You can also force this into the other direction by deactivating caching and getting incomplete results on the seond each iteration:

result = client.execute("SELECT 1 as [one]; SELECT 2 as [two]; SELECT 3 as [three]")
# => 
# #<TinyTds::Result:0x000073c52bfa0848
# ...
result.each(cache_rows: false) { |r| puts r.inspect  }
# {"one"=>1}
# {"two"=>2}
# {"three"=>3}
# => []
result.each(cache_rows: false) { |r| puts r.inspect  }
# nil
# => []

This is due to the fact as from FreeTDS perspective, all results from the current query were consumed. you would have to resend the originally query to the server in order to get all results again.

Ultimately, there would be ways to fix this in this C code that it behaves correctly with each and avoid the error mentioned by @clintmiller originally (Abort trap: 6). But I feel it is not worth the effort: I would start working on a refactoring that would involve the following:

  1. Move do and insert onto the client class.
  2. Extend execute to also fetch all results for the given query. Modify TinyTds::Result to be a pure Ruby class without any C code.
  3. As a consequence: Extend execute to accept result options as, timezone and empty_sets. Drop the other options as they are no longer needed or anyhow already a given by Enumerable.

What advantages would give that?

  1. The entire FreeTDS query lifecycle (dbcmd, dbsqlsend, dbsqlok, dbresults, dbnextrow and dbcancel) would be managed from the Client only.
  2. Fetching all rows by default and storing it onto a pure Ruby result object avoids troubles with how each should behave with the rest of the FreeTDS query lifecycle. It is also a cache and guarantees that all rows have been fetched. It will result in slightly higher memory usage.
  3. Hopefully a bit less C code to maintain :)

I wrote this paragraph earlier in the afternoon and although it no longer fits into the original comment, I did not want to discard it.

I double-checked in the ruby pg gem and the trilogy gem how they handle fetching rows. libpq apparently does not offer any asynchronous mechanism like FreeTDS does. It always fetches all results. trilogy actually has an asynchronous mechanism implemented, but when fetching the results for the Ruby client, they also fetch all results (see here waiting until the entire response arrived and the loop here to fill-in the result object).

@aharpervc
Copy link
Contributor

aharpervc commented Oct 14, 2024

3. As a consequence: Extend execute to accept result options as, timezone and empty_sets. Drop the other options as they are no longer needed or anyhow already a given by Enumerable.

Can they still be read from TinyTds::Client.default_query_options by default?

Also, any thoughts on this statement in the readme?

By default row caching is turned on because the SQL Server adapter for ActiveRecord would not work without it. I hope to find some time to create some performance patches for ActiveRecord that would allow it to take advantages of lazily created yielded rows from result objects. Currently only TinyTDS and the Mysql2 gem allow such a performance gain.

@andyundso
Copy link
Member

Hi @clintmiller, do you maybe have a chance to try my proposed fix in #563? You once provided a spec where you could reproduce the error consistently, but it has been a year, so maybe this is no longer accurate.

@clintmiller
Copy link
Author

@andyundso I'll get it built and see what it does on my test case!

@clintmiller
Copy link
Author

@andyundso Quick update here- this PR seems to fix the specific issue I was running into. I'd probably have to fully roll-out our CI pipeline in GitHub Actions to see if it's really stable for us over a couple of days worth of runs. That will go on my todo list.

As an aside, I also have had to update from the MSSQL docker image mssql/server:2017-latest to mssql/server:2019-latest. We've had all kinds of relatively recent instability with the 2017 container images, where the database server process itself is dumping core- and it doesn't seem to be related to our application's use via tiny_tds. All that to say, it obfuscates things a bit- as in, "what change is really helping here?"

@andyundso
Copy link
Member

@clintmiller If you are using GitHub Actions ubuntu-latest image, it will now run Ubuntu 24.04, which from my understanding does not really play well with SQL Server 2017 in Docker, as that Docker image uses Ubuntu 18.04 under the hood. You can switch back to Ubuntu 22.04 by saying runs-on: ubuntu-22.04. I actually had to do the same for our pipeline for the PR I opened.

@andyundso
Copy link
Member

Hi @clintmiller , just wanted to check in and ask how the fix is performing. do you less or even no troubles on your CI?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants