deduplicate routes and customers #3806
Comments
Can all duplicates be traced to an absorption? |
I find 12 |
11 are explained by absorptions. |
For the twelfth, the two participant accounts are clearly a duplicate of each other, but it's not clear how they both ended up with the same |
The twelfth only has a bank account, not a credit card. |
For the 11 absorptions, the path is clear: delete |
For the twelfth, it's clear that one is the primary account and the other is secondary, based on the amount of tip & withdrawal activity. There's only one withdrawal against the secondary account. I propose that we:
|
Looks like we've got a similar situation to #2085 but with credit cards (not with bank accounts[!]):
=> select count(*) from exchange_routes where address like 'CC%';
┌───────┐
│ count │
├───────┤
│ 3619 │
└───────┘
(1 row)
=> select count(*) from exchange_routes where address like '/cards/CC%';
┌───────┐
│ count │
├───────┤
│ 4018 │
└───────┘
(1 row) |
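The two counts above can also be gathered in a single pass over the addresses. This is a minimal sketch, assuming the addresses have already been fetched into a list; `tally_prefixes` and the sample values are hypothetical and not part of the Gratipay codebase.

```python
from collections import Counter


def tally_prefixes(addresses, prefixes=("/cards/CC", "CC")):
    """Count addresses by the first prefix they match; the rest go under 'other'."""
    counts = Counter()
    for address in addresses:
        for prefix in prefixes:
            if address.startswith(prefix):
                counts[prefix] += 1
                break
        else:
            # No prefix matched, e.g. a bank account route.
            counts["other"] += 1
    return counts


# Made-up sample data, for illustration only.
sample = ["CC123", "/cards/CC456", "/cards/CC789", "/bank_accounts/BA1"]
print(tally_prefixes(sample))
```

Counting only says how many addresses take each form. It does not say whether the two forms refer to the same kind of object, so it is not enough on its own to justify rewriting one form into the other.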
Planning to fix with:
UPDATE exchange_routes
   SET address='/cards/'||address
 WHERE address like 'CC%'; |
Wrong! Those are |
Actually, I don't think we want to deduplicate routes for archived participants. The reason is that the histories are in fact kept separate. In this case, it's appropriate to have separate |
... which means we should back out #3806 (comment). |
Done. |
#!/usr/bin/env python -u
from __future__ import absolute_import, division, print_function, unicode_literals

from gratipay import wireup

db = wireup.db(wireup.env())

with db.get_cursor() as cur:

    # One row per balanced_customer_href, with the usernames that share it.
    customers = cur.all("""\
        SELECT * FROM (
            SELECT DISTINCT ON(balanced_customer_href)
                   balanced_customer_href, count(username) n, array_agg(username) usernames
              FROM participants
             WHERE balanced_customer_href is not null
          GROUP BY balanced_customer_href
        ) _ ORDER BY n DESC
    """)

    for rec in customers:
        assert rec.n in (1, 2)
        if rec.n == 2:
            print(rec.n, rec.balanced_customer_href, ", ".join(rec.usernames), end='')
            one, two = rec.usernames

            # Is this duplicate explained by one account absorbing the other?
            absorption = cur.one("""\
                SELECT *
                  FROM absorptions
                 WHERE (archived_as=%(one)s AND absorbed_by=%(two)s)
                    OR (archived_as=%(two)s AND absorbed_by=%(one)s)
            """, dict(one=one, two=two))
            print(":", absorption.id if absorption else None)
            if absorption is None:
                print(' |')
                print(' |')
                continue
            print(" | Should unset balanced_customer_href for {}.".format(absorption.archived_as))

            # Credit card route of the surviving (absorbing) participant.
            good_routes = cur.all("""\
                SELECT er.*, p.username, p.id as user_id
                  FROM exchange_routes er
                  JOIN participants p ON er.participant = p.id
                 WHERE p.username=%s
                   AND network='balanced-cc'
            """, (absorption.absorbed_by,))
            if good_routes:
                assert len(good_routes) == 1

            # Credit card route of the archived participant.
            bad_routes = cur.all("""\
                SELECT er.*, p.username, p.id as user_id
                  FROM exchange_routes er
                  JOIN participants p ON er.participant = p.id
                 WHERE p.username=%s
                   AND network='balanced-cc'
            """, (absorption.archived_as,))
            if bad_routes:
                assert len(bad_routes) == 1
                if good_routes and (bad_routes[0].id == good_routes[0].id):
                    print(" | True duplicate! {}".format(bad_routes[0].id))
                elif good_routes and (bad_routes[0].address == good_routes[0].address):
                    print(" | Should fold route for {} into route for {}."
                          .format(bad_routes[0].user_id, good_routes[0].user_id))
                else:
                    print(" | Archived participant has an unduplicated route!")
            else:
                print(' |')

            # Disabled: the actual fix. The UPDATE returns no rows, so use run(),
            # and pass the username the %s placeholder expects.
            if 0 and absorption is not None:
                cur.run("""\
                    UPDATE participants
                       SET balanced_customer_href=null
                     WHERE username=%s
                """, (absorption.archived_as,))
            print()

    raise Exception  # trigger rollback
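The route comparison in the script can be read as a small decision function. The sketch below restates it without a database so the branches are easier to check. `Route` and `classify_routes` are hypothetical names introduced here, and the returned labels only paraphrase the script's printed messages.

```python
from collections import namedtuple

# Hypothetical stand-in for a row of exchange_routes joined to participants.
Route = namedtuple("Route", "id address user_id")


def classify_routes(good_routes, bad_routes):
    """Classify the archived participant's route against the surviving one.

    good_routes: routes of the absorbing participant (zero or one expected).
    bad_routes: routes of the archived participant (zero or one expected).
    """
    if not bad_routes:
        return "no route"
    bad = bad_routes[0]
    good = good_routes[0] if good_routes else None
    if good is not None and bad.id == good.id:
        return "true duplicate"  # literally the same row
    if good is not None and bad.address == good.address:
        return "fold"  # distinct rows pointing at the same card
    return "unduplicated"  # the archived participant has its own card


# Made-up rows, for illustration only.
survivor = [Route(id=1, address="/cards/CC1", user_id=10)]
archived = [Route(id=2, address="/cards/CC1", user_id=20)]
print(classify_routes(survivor, archived))
```

Keeping the "fold" and "unduplicated" cases apart matters here, because the histories of archived participants are kept separate and their routes are not necessarily redundant.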
Closing as wont-fix based on this. |
Reticketed from #2779. We don't have unique (network, address) in exchanges, nor participants.balanced_customer_href.
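Rows that would violate a unique (network, address) constraint can be listed with a GROUP BY ... HAVING query before any constraint is added. The sketch below is an assumption-laden illustration: it uses an in-memory SQLite database as a stand-in for the production Postgres database, a cut-down `exchange_routes` table with only the columns seen in this thread, and a hypothetical helper named `find_duplicate_routes`.

```python
import sqlite3


def find_duplicate_routes(conn):
    """Return (network, address, n) for every pair that occurs more than once."""
    return conn.execute("""
        SELECT network, address, count(*) AS n
          FROM exchange_routes
         GROUP BY network, address
        HAVING count(*) > 1
         ORDER BY network, address
    """).fetchall()


# In-memory stand-in for the real table; the schema and rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE exchange_routes
    ( id          INTEGER PRIMARY KEY
    , participant INTEGER
    , network     TEXT
    , address     TEXT
     )
""")
conn.executemany(
    "INSERT INTO exchange_routes (participant, network, address) VALUES (?, ?, ?)",
    [
        (10, "balanced-cc", "/cards/CC1"),
        (20, "balanced-cc", "/cards/CC1"),  # same card on a second participant
        (30, "balanced-cc", "/cards/CC2"),
        (30, "balanced-ba", "/bank_accounts/BA1"),
    ],
)
print(find_duplicate_routes(conn))
```

A query of this shape only reports candidates. Whether each pair should be merged still depends on the participants involved, for example whether one of them was archived by an absorption.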