Skip to content

Incorrect concurrent Postgres type map lookups by Hstore OID with multiple shards #54502

Open
@joshuay03

Description

@joshuay03

We ran into this at @buildkite when upgrading to 7.2.2.1.

This seems to have been introduced in e0a55b0 which was first included in 7.2, and happens to be fixed by 9ad36e0 which is currently only on main.

I think the issue is that before e0a55b0 when defining attributes, type lookups using the column (and therefore oid) only occurred inside load_schema!, right after schema load, and always with a mutex held by the caller (load_schema). After that change, these lookups happen inside _default_attributes, and it seems the schema may be re-loaded by another thread between column retrieval and lookup.

What I don't yet understand is that the schema cache and type map seem to be per-connection/shard, and the lookup checks its own connection out of the pool, so the oid used for the lookup and the oid in the type map should be the same. But in my debugging I found that the lookup oid would be for a different connection/shard to the one already in the type map. Okay I think I got it. I thought the columns_hash lookup was hitting the schema cache, but it's not, it's just stored on the class, which means a connection to another shard could have set it. Before, @columns_hash was set from the schema cache results and then its columns used for the lookup synchronously. Now, there may be a race condition between when the current thread sets it, and reads it, as another thread may have set it in-between when loading schema using another connection.

9ad36e0 then happens to fix this by removing the lookup altogether and storing the cast type on the column itself during schema load.

cc/ @jonathanhefner @rafaelfranca Sorry for the tags, but it would be great if 9ad36e0 or an alternative fix could be backported to 7.2 and 8.0. This was a bit of a nasty bug to run into in production as it resulted in some invalid data persisting due to incorrect deserialization, and was not straightforward to debug.

Steps to reproduce

# frozen_string_literal: true

require "bundler/inline"

gemfile(true) do
  source "https://rubygems.org"

  # gem "rails", path: "./"
  # gem "rails", "7.1.5.1" # ✅
  gem "rails", "7.2.2.1" # ❌
  # gem "rails", github: "rails/rails", branch: "7-2-stable" # ❌
  # gem "rails", "8.0.1" # ❌
  # gem "rails", github: "rails/rails", branch: "8-0-stable" # ❌
  # gem "rails", github: "rails/rails", branch: "main" # ✅

  gem "pg", "~> 1.1"
end

require "active_record/railtie"
require "minitest/autorun"

class TestApp < Rails::Application
  config.load_defaults Rails::VERSION::STRING.to_f
  config.eager_load = true
  config.logger = Logger.new($stdout)
  config.secret_key_base = "secret_key_base"

  config.active_record.encryption.primary_key = "primary_key"
  config.active_record.encryption.deterministic_key = "deterministic_key"
  config.active_record.encryption.key_derivation_salt = "key_derivation_salt"
end
Rails.application.initialize!

# Note: This script pairs with a config/database.yml that has the following:
#
# development:
#   primary:
#     adapter: postgresql
#     database: development_primary
#   shard_001:
#     adapter: postgresql
#     database: development_shard_001
#   shard_002:
#     adapter: postgresql
#     database: development_shard_002
#
# and Postgres databases named `development_primary`, `development_shard_001`,
# and `development_shard_002`, each with the hstore extension enabled.

class ApplicationRecord < ActiveRecord::Base
  primary_abstract_class

  connects_to database: { writing: :primary, reading: :primary }
end

ActiveRecord::Schema.define do
  create_table :posts, force: true do |t|
    t.hstore :metadata
  end
end

class ShardRecord < ApplicationRecord
  self.abstract_class = true

  connects_to shards: {
    shard_001: { writing: :shard_001, reading: :shard_001 },
    shard_002: { writing: :shard_002, reading: :shard_002 }
  }
end

[:shard_001, :shard_002].each do |shard|
  ShardRecord.connected_to(shard: shard) do
    ShardRecord.connection.create_table :posts, force: true do |t|
      t.hstore :metadata
    end
  end
end

class Post < ShardRecord
end

class BugTest < ActiveSupport::TestCase
  def test_association_stuff
    25.times.map do |i|
      Thread.new do
        [:shard_001, :shard_002].sample.tap do |shard|
          ShardRecord.connected_to(shard: shard) do
            Post.create!(metadata: {}) # raises `TypeError: can't cast Hash`
          end
        end
      end
    end.each(&:join)
  end
end

Expected behavior

All post records should be created successfully.

Actual behavior

TypeError: can't cast Hash is raised, full stack trace on 7.2.2.1:

  1) Error:
BugTest#test_association_stuff:
TypeError: can't cast Hash
    activerecord/lib/active_record/connection_adapters/abstract/quoting.rb:105:in 'ActiveRecord::ConnectionAdapters::Quoting#type_cast'
    activerecord/lib/active_record/connection_adapters/postgresql/quoting.rb:185:in 'ActiveRecord::ConnectionAdapters::PostgreSQL::Quoting#type_cast'
    activerecord/lib/active_record/connection_adapters/abstract/quoting.rb:227:in 'block in ActiveRecord::ConnectionAdapters::Quoting#type_casted_binds'
    activerecord/lib/active_record/connection_adapters/abstract/quoting.rb:225:in 'Array#map'
    activerecord/lib/active_record/connection_adapters/abstract/quoting.rb:225:in 'ActiveRecord::ConnectionAdapters::Quoting#type_casted_binds'
    activerecord/lib/active_record/connection_adapters/postgresql_adapter.rb:892:in 'ActiveRecord::ConnectionAdapters::PostgreSQLAdapter#exec_no_cache'
    activerecord/lib/active_record/connection_adapters/postgresql_adapter.rb:873:in 'ActiveRecord::ConnectionAdapters::PostgreSQLAdapter#execute_and_clear'
    activerecord/lib/active_record/connection_adapters/postgresql/database_statements.rb:66:in 'ActiveRecord::ConnectionAdapters::PostgreSQL::DatabaseStatements#internal_exec_query'
    activerecord/lib/active_record/connection_adapters/abstract/database_statements.rb:159:in 'ActiveRecord::ConnectionAdapters::DatabaseStatements#exec_insert'
    activerecord/lib/active_record/connection_adapters/postgresql/database_statements.rb:85:in 'ActiveRecord::ConnectionAdapters::PostgreSQL::DatabaseStatements#exec_insert'
    activerecord/lib/active_record/connection_adapters/abstract/database_statements.rb:197:in 'ActiveRecord::ConnectionAdapters::DatabaseStatements#insert'
    activerecord/lib/active_record/connection_adapters/abstract/query_cache.rb:27:in 'ActiveRecord::ConnectionAdapters::AbstractAdapter#insert'
    activerecord/lib/active_record/persistence.rb:258:in 'block in ActiveRecord::Persistence::ClassMethods#_insert_record'
    activerecord/lib/active_record/connection_adapters/abstract/connection_pool.rb:421:in 'ActiveRecord::ConnectionAdapters::ConnectionPool#with_connection'
    activerecord/lib/active_record/connection_handling.rb:296:in 'ActiveRecord::ConnectionHandling#with_connection'
    activerecord/lib/active_record/persistence.rb:251:in 'ActiveRecord::Persistence::ClassMethods#_insert_record'
    activerecord/lib/active_record/persistence.rb:928:in 'block in ActiveRecord::Persistence#_create_record'
    activerecord/lib/active_record/connection_adapters/abstract/connection_pool.rb:421:in 'ActiveRecord::ConnectionAdapters::ConnectionPool#with_connection'
    activerecord/lib/active_record/connection_handling.rb:296:in 'ActiveRecord::ConnectionHandling#with_connection'
    activerecord/lib/active_record/persistence.rb:925:in 'ActiveRecord::Persistence#_create_record'
    activerecord/lib/active_record/counter_cache.rb:201:in 'ActiveRecord::CounterCache#_create_record'
    activerecord/lib/active_record/locking/optimistic.rb:84:in 'ActiveRecord::Locking::Optimistic#_create_record'
    activerecord/lib/active_record/encryption/encryptable_record.rb:184:in 'ActiveRecord::Encryption::EncryptableRecord#_create_record'
    activerecord/lib/active_record/attribute_methods/dirty.rb:240:in 'ActiveRecord::AttributeMethods::Dirty#_create_record'
    activerecord/lib/active_record/callbacks.rb:445:in 'block in ActiveRecord::Callbacks#_create_record'
    activesupport/lib/active_support/callbacks.rb:101:in 'ActiveSupport::Callbacks#run_callbacks'
    activesupport/lib/active_support/callbacks.rb:914:in 'ActiveRecord::Base#_run_create_callbacks'
    activerecord/lib/active_record/callbacks.rb:445:in 'ActiveRecord::Callbacks#_create_record'
    activerecord/lib/active_record/timestamp.rb:116:in 'ActiveRecord::Timestamp#_create_record'
    activerecord/lib/active_record/persistence.rb:896:in 'ActiveRecord::Persistence#create_or_update'
    activerecord/lib/active_record/callbacks.rb:441:in 'block in ActiveRecord::Callbacks#create_or_update'
    activesupport/lib/active_support/callbacks.rb:101:in 'ActiveSupport::Callbacks#run_callbacks'
    activesupport/lib/active_support/callbacks.rb:914:in 'ActiveRecord::Base#_run_save_callbacks'
    activerecord/lib/active_record/callbacks.rb:441:in 'ActiveRecord::Callbacks#create_or_update'
    activerecord/lib/active_record/timestamp.rb:127:in 'ActiveRecord::Timestamp#create_or_update'
    activerecord/lib/active_record/persistence.rb:426:in 'ActiveRecord::Persistence#save!'
    activerecord/lib/active_record/validations.rb:54:in 'ActiveRecord::Validations#save!'
    activerecord/lib/active_record/transactions.rb:366:in 'block in ActiveRecord::Transactions#save!'
    activerecord/lib/active_record/transactions.rb:418:in 'block (2 levels) in ActiveRecord::Transactions#with_transaction_returning_status'
    activerecord/lib/active_record/connection_adapters/abstract/transaction.rb:616:in 'block in ActiveRecord::ConnectionAdapters::TransactionManager#within_new_transaction'
    activesupport/lib/active_support/concurrency/null_lock.rb:9:in 'ActiveSupport::Concurrency::NullLock#synchronize'
    activerecord/lib/active_record/connection_adapters/abstract/transaction.rb:613:in 'ActiveRecord::ConnectionAdapters::TransactionManager#within_new_transaction'
    activerecord/lib/active_record/connection_adapters/abstract/database_statements.rb:361:in 'ActiveRecord::ConnectionAdapters::DatabaseStatements#transaction'
    activerecord/lib/active_record/transactions.rb:414:in 'block in ActiveRecord::Transactions#with_transaction_returning_status'
    activerecord/lib/active_record/connection_adapters/abstract/connection_pool.rb:427:in 'ActiveRecord::ConnectionAdapters::ConnectionPool#with_connection'
    activerecord/lib/active_record/connection_handling.rb:296:in 'ActiveRecord::ConnectionHandling#with_connection'
    activerecord/lib/active_record/transactions.rb:410:in 'ActiveRecord::Transactions#with_transaction_returning_status'
    activerecord/lib/active_record/transactions.rb:366:in 'ActiveRecord::Transactions#save!'
    activerecord/lib/active_record/suppressor.rb:56:in 'ActiveRecord::Suppressor#save!'
    activerecord/lib/active_record/persistence.rb:55:in 'ActiveRecord::Persistence::ClassMethods#create!'
    test.rb:87:in 'block (4 levels) in BugTest#test_association_stuff'
    activerecord/lib/active_record/connection_handling.rb:376:in 'ActiveRecord::ConnectionHandling#with_role_and_shard'
    activerecord/lib/active_record/connection_handling.rb:147:in 'ActiveRecord::ConnectionHandling#connected_to'
    test.rb:86:in 'block (3 levels) in BugTest#test_association_stuff'
    <internal:kernel>:91:in 'Kernel#tap'
    test.rb:85:in 'block (2 levels) in BugTest#test_association_stuff'

System configuration

Rails version: 7.2.2.1, but also tested against 7.1.5.1, 7-2-stable, 8.0.1, 8-0-stable, and main as noted at the top of the script

Ruby version: 3.3.7

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions