Skip to content
This repository has been archived by the owner on Nov 20, 2018. It is now read-only.

Thread safety problems? #45

Closed
myitcv opened this issue Feb 19, 2014 · 7 comments
Closed

Thread safety problems? #45

myitcv opened this issue Feb 19, 2014 · 7 comments
Labels

Comments

@myitcv
Copy link

myitcv commented Feb 19, 2014

I have created a heavily threaded test rig to try and point out what I think are thread-safety problems in aws-sdk-core. Before diving too far into investigation I wanted to throw this a bit wider to see if I'm missing something.

These tests were carried out in the following environment:

$ uname -a
Linux myitcv-virtual-machine 3.11.0-17-generic #31-Ubuntu SMP Mon Feb 3 21:52:43 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
$ ruby -v
rubinius 2.2.5 (2.1.0 e543ba32 2014-02-08 JI) [x86_64-linux-gnu]
$ lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                4
On-line CPU(s) list:   0-3
Thread(s) per core:    1
Core(s) per socket:    1
Socket(s):             4
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 70
Stepping:              1
CPU MHz:               2594.193
BogoMIPS:              5188.38
Hypervisor vendor:     VMware
Virtualisation type:   full
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              6144K
L4 cache:              131072K
NUMA node0 CPU(s):     0-3

Source code for v1 of the test rig and the accompanying Gemfile behind those links. bundle install to get up and running.

The reason for using celluloid is that we are building a process atop celluloid hence the test is more fair (but admittedly not fully stripped back to bare Ruby/Rubinius)

  • access_key_id etc will need to be populated before using test.rb
  • The code creates a pool of 50 threads, then makes 100 async calls into that pool
  • Each call makes a call to DynamoDB to list tables
  • After bundle install, ruby test.rb (assuming you have the right Ruby interpreter set via rbenv etc) should be enough
  • Yes, this line could be made more efficient but leaving it as such makes the thread safety problem more apparent (see later discussion about a revised versions v2 and v3)

This should be as vanilla as it gets, yet there are three types of exception I've been hitting. But not consistently which is what leads me to believe there's a thread safety issue. They are:

  1. Unrecognized properties exception
  2. Tuple out of bounds exception
  3. Invalid signature exception

Looking at the top of the call stack of exception 1, we are taken to this code. There are lots of class instance variables here which I don't believe are thread safe unless I'm missing something about how this get's called?

v2 and v3 present alternatives which create one Aws::DynamoDB instance per thread and globally respectively. Both suffer similar issues to varying degrees. The one regularly occurring common error between all three versions is point 3 above, the invalid signature error.

Before we look any further, is there an assumed usage pattern here? i.e. should one create a single, global Aws::DynamoDB instance, or one per thread, or per call?

Any thoughts on what the issue is here?

Are any of these issues potentially related to #43?

@trevorrowe
Copy link
Contributor

I haven't had a chance to dig into this yet, but I plan to. There is no assumed usage pattern. You should be able to create one or many clients. A constructed client should be thread safe.

@myitcv
Copy link
Author

myitcv commented Feb 21, 2014

@trevorrowe - thanks very much

@trevorrowe
Copy link
Contributor

I've been working on a number of the threading issues today and I wanted to respond to this issue as well. The core of this issue does stem from a number of thread-safety issues in the Seahore::Model classes and modules.

I have been working on a major update to aws-sdk-core, specifically the Seahorse::Model classes. In this update, I have been addressing the thread safety issues. There is an experimental branch (API clients are stable except for EC2) that has the model changes here: https://github.com/aws/aws-sdk-core-ruby/tree/normalized-resources. I will try to clean that up and push a separate branch that lacks some of the other experimental features (resources) from the linked branch.

To answer one of your other questions, I do believe the signature errors are related to #43. I recently pushed a commit that should resolve those issues. We ran into a similar issue in the aws-sdk gem and were able to resolve it by using OpenSSL::Digest instead of Digest. See the related commit here: 83883cb for this library and the aws-sdk issue here: aws/aws-sdk-ruby#529

If I can find the time this week, I'll try to run your test suite against the normalized branch and report back with some results. Thanks for your patience and excellent sleuthing.

@trevorrowe
Copy link
Contributor

BTW, v2.0.0.rc9 has been released with the OpenSSL fix. I'm hoping that should resolve the signature errors.

@lardcanoe
Copy link

Changes in rc9 have been working great for me in multi-threaded code. The only issue I am seeing, which is just warning, are lines like:

..../vendor/jruby/1.9/gems/aws-sdk-core-2.0.0.rc9/lib/aws/service.rb:183 warning: already initialized constant V20100515

when the threads initially start loading and making requests.

I haven't tried out the experimental branch to see if I get this there as well.

@trevorrowe
Copy link
Contributor

Thats good news. The warnings about constants should be easy to straighten out. Let me see what I can do for that.

@trevorrowe
Copy link
Contributor

I'm going to close this issue, as the original problem has been addressed. I've created a new issue to track the warning generated by const_set.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

3 participants