UnicodeSetAttribute - upgrade path issues, corruption #377

Closed
scode opened this issue Oct 16, 2017 · 0 comments
scode commented Oct 16, 2017

The behavior of UnicodeSetAttribute has changed in non-backwards-compatible ways as of the 1.6.0 and 3.0.1 releases of PynamoDB. In versions prior to 1.6.0, unicode sets were represented in a way that was compatible with other PynamoDB readers, but did not accurately represent the data in DynamoDB because each element was JSON-encoded. Using PynamoDB < 1.6.0 to write {'test'} would result in the string "test" (including the double quotes) ending up in the database; the read path would strip the quotes off again.
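The pre-1.6.0 round trip can be modelled with plain `json` calls. This is only a sketch of the behavior described above, not PynamoDB's actual code; the function names are hypothetical.

```python
import json

def write_element_pre_1_6(value):
    # Hypothetical model of the < 1.6.0 write path: JSON-encode each element.
    return json.dumps(value)

def read_element_pre_1_6(raw):
    # Hypothetical model of the < 1.6.0 read path: JSON-decode each element.
    return json.loads(raw)

# What ends up stored, i.e. what a non-PynamoDB client would see.
stored = {write_element_pre_1_6(v) for v in {'test'}}
print(stored)  # Python 3 prints: {'"test"'}

# The matching read path removes the quotes again.
print({read_element_pre_1_6(raw) for raw in stored})  # Python 3 prints: {'test'}
```

The stored form and the logical form differ by one layer of JSON encoding, which is what the later releases had to deal with.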

In #151, which went into the 1.6.0 release, the behavior was changed so that writes would correctly represent the string in the database. In an attempt to be backwards compatible, however, the read path would "taste" the data by attempting to decode it as JSON. If it decoded successfully, the data was assumed to have been written with an older version of PynamoDB, and the string returned by the JSON parser was used. As a result, using that version of PynamoDB to write {'"test"'} would result in {'test'} being read back, effectively corrupting data.
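A minimal sketch of the "tasting" read path, assuming it amounts to a JSON decode with a fallback to the raw string. The function name is hypothetical and this is not the library's actual implementation.

```python
import json

def read_element_tasting(raw):
    # Hypothetical model of the >= 1.6.0, < 3.0.1 read path:
    # try to decode as JSON, fall back to the raw string on failure.
    try:
        return json.loads(raw)
    except ValueError:
        return raw

# Written as-is by a >= 1.6.0 writer, so these are also the stored values.
written = {'"test1"', 'test2'}

read_back = {read_element_tasting(raw) for raw in written}
print(sorted(read_back))  # ['test1', 'test2'] -- the quotes around test1 are lost
```

In this sketch any stored string that happens to be valid JSON is affected, not only quoted ones: `read_element_tasting('123')` returns the integer `123`.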

In #294, which went into the 3.0.1 release, the behavior was changed again such that the read path no longer did the JSON decode tasting.

The upshot is this:

  • If your application does not use UnicodeSetAttribute, you're good.
  • If your application only ever ran with 3.0.1 or later, you're good.
  • Upgrading from PynamoDB < 1.6.0 to PynamoDB >= 1.6.0, < 3.0.1
    • may result in incorrect data being read if and only if you stored a string that is itself valid JSON in a set read using UnicodeSetAttribute
    • causes rolling upgrades (where readers are on the older version while writers write using the new version) to break, because old readers cannot read values written by new writers that do not decode as JSON
    • reverting the upgrade is not possible
  • Upgrading from PynamoDB < 1.6.0 to PynamoDB >= 3.0.1
    • will result in strings written using UnicodeSetAttribute being read back incorrectly, in JSON-encoded form, just as a non-PynamoDB client would have read them
    • causes rolling upgrades (where readers are on the older version while writers write using the new version) to break, because old readers cannot read values written by new writers that do not decode as JSON
    • reverting the upgrade is not possible
  • Upgrading from PynamoDB >= 1.6.0, < 3.0.1 to PynamoDB >= 3.0.1
    • will work; any data items read incorrectly under the previous version will stop being read incorrectly. If your application relies on reading items incorrectly, it may break.
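The combinations above can be tabulated with a small simulation. This is a sketch that models each generation of writer and reader with plain `json` calls; all function names are hypothetical stand-ins and none of this is PynamoDB code.

```python
import json

def write_json(value):    # models writers < 1.6.0
    return json.dumps(value)

def write_plain(value):   # models writers >= 1.6.0
    return value

def read_json(raw):       # models readers < 1.6.0
    return json.loads(raw)

def read_tasting(raw):    # models readers >= 1.6.0, < 3.0.1
    try:
        return json.loads(raw)
    except ValueError:
        return raw

def read_plain(raw):      # models readers >= 3.0.1
    return raw

WRITERS = [("<1.6.0", write_json), (">=1.6.0", write_plain)]
READERS = [("<1.6.0", read_json),
           (">=1.6.0,<3.0.1", read_tasting),
           (">=3.0.1", read_plain)]

def outcome(write, read, value):
    # Returns the repr of what the reader sees, or "unreadable" if decoding fails.
    try:
        return repr(read(write(value)))
    except ValueError:
        return "unreadable"

for value in ('test', '"test"'):
    for w_name, write in WRITERS:
        for r_name, read in READERS:
            print("value=%r writer %s reader %s -> %s"
                  % (value, w_name, r_name, outcome(write, read, value)))
```

In this model, an old reader facing a new writer's `'test'` is the "unreadable" case behind the rolling-upgrade breakage, and a 3.0.1 reader facing an old writer's data sees the JSON-encoded form.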

Some adapted example code can be used to repro:

from pynamodb.models import Model
from pynamodb.attributes import UnicodeAttribute
from pynamodb.attributes import UnicodeSetAttribute

class UserModel(Model):
    """
    A DynamoDB User
    """
    class Meta:
        table_name = "dynamodb-user"
        host = "http://localhost:9204"
    email = UnicodeAttribute(null=True)
    first_name = UnicodeAttribute(range_key=True)
    last_name = UnicodeAttribute(hash_key=True)
    tags = UnicodeSetAttribute()

UserModel.create_table(read_capacity_units=1, write_capacity_units=1)

for user in UserModel.query("Smith", first_name__begins_with="J"):
    print(user.first_name)

user = UserModel("John", "Denver")
user.tags = {'"test1"', 'test2'}
user.save()

try:
    user = UserModel.get("John", "Denver")
    print(user)
    print(user.tags)
except UserModel.DoesNotExist:
    print("User does not exist")

On a >= 3.0.1 version this works correctly and prints these tags:

set([u'"test1"', u'test2'])

On a version from 1.6.0 up to (but not including) 3.0.1, the read path strips off the double quotes:

set([u'test1', u'test2'])