Dump file is growing unsustainably fixes 394 #404
Conversation
jobs/buildhub/to_kinto.py
Outdated
)
# XXX Not sure why this needs to be a "quotation mark"
# encoded thing
previous_run_timestamp = '"%s"' % highest_timestamp
The ETag header is the timestamp between quotes (`?_since="1234"`).
Maybe you could rename the variable to `previous_run_etag` instead.
I renamed it as per your suggestion.
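For illustration, the quoting discussed in this thread can be sketched as below. This is only a sketch: `to_etag` is a hypothetical helper that does not appear in the PR, and `1234` is a made-up timestamp.

```python
def to_etag(timestamp):
    """Wrap a timestamp in double quotes, the ETag form described in the review."""
    return '"%s"' % timestamp


highest_timestamp = 1234  # hypothetical value, for illustration only
previous_run_etag = to_etag(highest_timestamp)
print(previous_run_etag)
```

The variable name then says what the value is (a quoted ETag), rather than suggesting a bare timestamp.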
with open(self.cache_file) as f:
    records = json.load(f)
assert len(records) == 1
assert records['a'][0] == 2  # [0] is last modified
It does not feel like you test the migration here. Add another record not returned in `get_records()` to assert that previous entries were migrated too.
See my updated commit. Better?
@leplatrem Can you look over the latest commits I pushed after you approved? It was failing because the unit tests were leaking state: whether they all passed or one of them failed depended on the order in which they ran.
def record_unchanged(record):
    return (
        record['id'] in existing and
        existing.get(record['id']) == hash_record(record)
Since `hash_record()` never returns `None`, you can write:
return existing.get(record['id']) == hash_record(record)
and thus even maybe drop the use of this `record_unchanged()` function.
No, if I do that I have to execute `hash_record(record)` just so that it can ultimately become `if None == 'somehashstringhere':`.
The first part of the condition helps figure out whether it's worth bothering to calculate a hash. With both parts, the statement becomes (if you break it down in steps) `if False and existing.get(record['id']) == hash_record(record):`, which means it exits on the first `False` and doesn't bother executing the `existing.get(record['id']) == hash_record(record)` part at all.
Right?
yes
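The short-circuit behaviour agreed on above can be checked with a small sketch. Note the assumptions: this `hash_record()` is a stand-in (the real one in the PR may hash differently), and the `hash_calls` counter and sample records exist only for this demonstration.

```python
import hashlib
import json

hash_calls = 0  # counts how often the stand-in hash function runs


def hash_record(record):
    """Stand-in for the PR's hash_record(); the real implementation may differ."""
    global hash_calls
    hash_calls += 1
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.md5(payload).hexdigest()


known = {"id": "a", "title": "b"}
existing = {"a": hash_record(known)}
hash_calls = 0  # reset after seeding the cache


def record_unchanged(record):
    # `and` short-circuits: the hash is only computed when the id is known.
    return (
        record["id"] in existing and
        existing.get(record["id"]) == hash_record(record)
    )


unknown_result = record_unchanged({"id": "z", "title": "new"})
calls_after_unknown = hash_calls  # no hash computed for an unknown id

known_result = record_unchanged(known)
calls_after_known = hash_calls  # exactly one hash computed for a known id
```

With the single-expression form `existing.get(record['id']) == hash_record(record)`, the result would be the same, but the hash would be computed for every record, including ids that are not in `existing`.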
jobs/tests/test_to_kinto.py
Outdated
# The migration is done, but let's make sure the fetch_existing
# continues to work as expected.
mocked.get_records.return_value = [
    {'id': 'a', 'title': 'b', 'last_modified': 2},
Since the migration is done, I suppose it should use the latest ETag, and this record does not seem to have changed, so it should not be listed here.
In order to test that the entries present in the old cache effectively end up in the new cache, it would make sense not to list this record here, so that when you make sure it is present at the end, you know it's taken from the old cache file.
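A rough sketch of the test shape suggested here, under heavy assumptions: this `fetch_existing()` is a simplified stand-in rather than the function in `to_kinto.py`, the old cache is assumed to be a plain list of records, and the new cache is assumed to map each id to `[last_modified, hash]` (only the `[0] is last modified` part is confirmed by the test under review).

```python
import hashlib
import json
from unittest import mock


def hash_record(record):
    """Stand-in hash; the real hash_record() may differ."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.md5(payload).hexdigest()


def fetch_existing(client, old_cache):
    """Hypothetical stand-in: migrate old entries, then merge newer records."""
    new_cache = {
        r["id"]: [r["last_modified"], hash_record(r)] for r in old_cache
    }
    for r in client.get_records():
        new_cache[r["id"]] = [r["last_modified"], hash_record(r)]
    return new_cache


# Record 'a' lives only in the old cache; the server returns only 'c'.
old_cache = [{"id": "a", "title": "b", "last_modified": 2}]
mocked = mock.MagicMock()
mocked.get_records.return_value = [
    {"id": "c", "title": "d", "last_modified": 3},
]

records = fetch_existing(mocked, old_cache)
```

Because `get_records()` never returns record `a`, finding it in `records` shows that it was carried over from the old cache rather than fetched again.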
No description provided.