Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow trashed events to be deleted and at the same time prevent them from orphaning other data records GDPR #349

Closed
sethshoultes opened this issue Apr 25, 2018 · 19 comments

Comments

@sethshoultes
Copy link
Contributor

See original Codebase ticket:
https://events.codebasehq.com/projects/event-espresso/tickets/9009

This has come up in the forums and I think there's some room for improvement when it comes to making things easier for event managers.

Steps to reproduce:

  1. Set up an event and assign some people CPTs to the event, save the event.
  2. Trash the event
  3. Go the trashed events view in the wp admin list table
  4. check the box next to the event
  5. select the Delete Permanently action and click Apply

Result is you'll see some errors that say you can't delete the event. So now you have to go and restore the event from the trash, then you have to edit the event, and remove all of the relationships to things like People CPts. Then trash, then you can permanently delete.

Can we change the above so the relationships automatically get removed so all they'll have to do is trash then delete the event?

@mnelson4
Copy link
Contributor

Hey Seth ya this sounds like an important feature for allowing admins to clear out old events, but I'm not sure why this is categorized as a security issue.

@sethshoultes
Copy link
Contributor Author

Oops. Didn't mean to do that. I updated the category to GDPR.

@joshfeck
Copy link
Contributor

@stale stale bot added the status:stale label May 10, 2019
@eventespresso eventespresso deleted a comment from stale bot May 14, 2019
@stale stale bot removed the status:stale label May 14, 2019
@joshfeck
Copy link
Contributor

Would this also allow for deleting events that have registrations?

@mnelson4
Copy link
Contributor

I don't think it's been specified. What would customers probably want to happen when deleting an event with registrations? orphan them? block deletion of the event until registrations are deleted? automatically delete all the registrations too?

@joshfeck
Copy link
Contributor

The current behavior is it blocks deletion of the event until registrations are deleted.

Would it be OK to instead allow the event to be deleted and orphan the registrations?

@mnelson4
Copy link
Contributor

Would it be OK to instead allow the event to be deleted and orphan the registrations?

It would certainly test how defensive our code- the code should generally just say "oups, can't find the event for this registration" and then keep working. But there may be the odd fatal error that would happen because we got careless.

Although I've reviewed this ticket some more: I think this work would only help with GDPR and user privacy etc if it actually facilitated deleting registrations/contacts. Just deleting the event and orphaning the registrations doesn't help with GDPR because the personal information (on the registration and contact) is still in the system.

I'm actually not sure what use-case facilitating just deleting the event (and orphaning its related data) would benefit.

So I would think this is really only useful if it will also delete the personal information contained on registrations and contacts etc.

@eventespresso eventespresso deleted a comment from stale bot Oct 23, 2019
@eventespresso eventespresso deleted a comment from stale bot Oct 23, 2019
@eventespresso eventespresso deleted a comment from stale bot Oct 23, 2019
@eventespresso eventespresso deleted a comment from stale bot Oct 23, 2019
@joshfeck
Copy link
Contributor

joshfeck commented Oct 24, 2019

Since there's likely a lot of data that may need to be deleted, would this be something we'd use the Batch system to break things up into smaller requests?

On a site without MER, it'd probably be fairly straightforward to delete an event and its related data:

Delete:

[ ] - event post in wp_posts
[ ] - any post meta
[ ] - datetimes
[ ] - tickets
[ ] - prices
[ ] - registrations
[ ] - transactions
[ ] - line items
[ ] - payments
[ ] - relationship records to wait lists, people CPTs, Venues

Do not delete

[ ] - Contacts related to the registrations would remain because you can have contacts without a registration, and there may be other registrations for other events attached to any given Contact.

[ ] - Where things could get sticky is on sites where MER is activated. Maybe that would require doing a check on the transactions and if the transaction has line items from other events, then the transaction needs to persist.

@mnelson4
Copy link
Contributor

Since there's likely a lot of data that may need to be deleted, would this be something we'd use the Batch system to break things up into smaller requests?

Yes I think that would be appropriate (especially for those events with hundreds of attendees).

Do not delete... Contacts

I suspect users will be divided on this: some will want the contacts gone too (eg if the contacts are only for one event and the event manager is trying to adhere to GDPR's "Privacy by Design" and delete old data), while others will want to keep them. It's a similar situation with venues (and transactions with MER, like you mentioned), because those can be for different events.

I think it would be best if we asked users if they want to delete contacts, transactions, venues, etc (provided they're not in use by other events); and it would be nice to have a confirmation page (where we display what data, including related data, will get deleted)

@joshfeck
Copy link
Contributor

I don't think venues should be deleted when an event is deleted. The relationship between an event and venue would be deleted, but allowing a venue to be deleted when an event is deleted would probably lead to accidental deletion of venues. A user can delete venues already via the Venues list table.

@mnelson4
Copy link
Contributor

mnelson4 commented Oct 31, 2019

Here's my attempt at "shaping" this task. Up for debate.

The Problem

Users often want to delete events to either clear our their event backlog, or in order to remove personal information of contacts.
Right now, to delete an event, you need to first:

  • delete the events registrations, but first...
  • delete their transactions, but first...
  • delete their payments (by manually going into each transaction and deleting them)

The Appetite

I'd say this is a fairly major administrative feature. While its possible to work around it, it's not reasonable for events with over a dozen-or-so registrations.
@joshfeck are we primarily just interested in facilitating deleting events, or other things too? (eg transactions, registrations, question groups, etc) If the former, the code can be quite event-specific. If the latter, its especially time-effective to create a generalized system for handling recursive deletions.

UI Considerations

For the first version, I'd like to not bother with any options, and just delete the essential stuff. Following Josh's list IMO that's:

  • event meta
  • datetimes
  • tickets
  • prices for those tickets (not general price modifiers or default prices)
  • message templates
  • registrations
  • registration answers
  • relations to wait lists, people, venues

Contacts and transactions (and the transaction's associated payments and line items) can exist OK without associated events. So for the initial version of this, I'm inclined to leave them.

@joshfeck do we want to have a confirmation page? A page that could confirm everything that will get deleted with the event. Something like this (from Django):
screenshot-dev earlychildhoodeducator com-2019 10 31-13_05_12 If not, after the user clicks to delete an event, maybe we'd show a warning, then immediately start deleting it and its associated data.

When it comes time to do the actual deleting, I think we'd send the user to a batch process page, where they could see the progress of the deletion.

Code Considerations

There are two main approaches I see for implementing this: A) select then start deleting, or B) just immediately start deleting.
Option A makes sense if there's a confirmation page, as we'd first grab all the data we want to delete, have the user confirm its deletion, then deleting should be quicker because we already have the items identified.
Option B might be less work if there's no confirmation page.

Regardless, we'll need a way to traverse the tree of model objects. Eg you pass the event into a model instance (ie an EE_Event), and then it looks at all the models that depend on it (ie, EE_Has_Many_Relation that are set to block deletions, and EE_HABTM_Relations's join tables), and fetches all the items that directly rely on it (ie have a foreign key field that points to that instance), and recurse.

So for an event that algorithm would visit:

Event
-Registration
--Answer
--Checkin
--Registration_Payment
--(not Messages because they don't block deletions)
-Datetime
--Checkin
--Datetime_Ticket
-Event_Question_Group
-Term_Relationship
-Event_Message_Template_Group

There are a few more models we would probably want to delete, just because they aren't accessible in the UI without going through an event (even though theoretically there could be an admin page listing them independently of events):

Ticket
-Datetime_Ticket
-Registration
--Answer
--Checkin
--Registration_Payment
-Ticket_Price
Price
Message_Template_Group
-Message_Template
-Message

Because it's the controller-layer (EE_Admin_Page children) that decided these models shouldn't exist independently of an event (not the database layer), I think it's appropriate for the logic that triggers their cascading deletion should also exist in the same layer.

And, for a v2, I would suggest we could then give users the option to also delete:

Attendee
-Registration
--Answer
--Checkin
--Registration_Payment
-Message
-Term_Relationship
Transaction
-Registration
--Answer
--Checkin
--Registration_Payment
-Payment
--Registration_Payment
-Line_Item
-Message

And each of their postmetas, extra metas, and extra join entries.

If we want a confirmation page, it might be tricky to generate that page for very large events and avoid timeouts. It seems reasonable to just fetch the first 100 or 200 items; and then say "...more". If that's not ok, we'd probably need another batch job for first fetching all the data to delete (in addition to the batch job for processing deletions.)

I think option A (fetching all the items-to-delete before actually beginning to delete them) is probably more useful. Besides allowing us to have a confirmation page, we'll know how many items to delete so we can display a proper progress bar, and we could probably write more efficient SQL delete queries (eg delete all the event's registrations' answers in a single query rather than one query per registration). It may also be useful even when we don't want to recursively delete, because we can show the user what data is blocking deletions.

Rabbit Holes to Avoid

Transactional queries

What happens if a deletion query fails mid-query? If it were all part of a database transaction, we could revert what had already been deleted, in order to keep the database in a "valid" state.
But we haven't used transactional database queries in production yet. They'll be tricky to unit-test (because the unit tests use a transaction for every testcase, and you can't really nest database transactions), the model objects will get out-of-sync, and lastly the queries are being ran across multiple HTTP requests. So IMO transaction queries sound like they'll be difficult and beyond the scope of this project.

If we first select everything to delete, store it somewhere, then delete it as part of a batch process, we'll be keeping track of the progress as we go. If a problem occurs mid-request, we can pickup where we left off (that's what the batch system does naturally.)

Refactoring the models

I don't think this will necessitate refactoring models, although I'm sure we'll see plenty of ways to improve them. I'd like to build this as a separate system that uses the models, but not as part of the models. We may need to store a little more information about model relations (eg across which relations do we want to propogate the deletion), but I think we already have what we need.

Any other rabbit holes where we could get lost (ie, consume a ton of time without doing anything currently needed) working on this?

I think I'd first like Josh's feedback on the UX and rabbit holes to avoid, then get Brent's feedback on the implementation details

@joshfeck
Copy link
Contributor

joshfeck commented Nov 4, 2019

Contacts and transactions (and the transaction's associated payments and line items) can exist OK without associated events.

This is true, but as it stands now, a transaction must have at least one registration. Otherwise there's a fatal error:

Fatal error: Uncaught Error: Call to a member function reg_code() on bool in /Applications/MAMP/htdocs/ee4/wp-content/plugins/32-core/admin_pages/transactions/Transactions_Admin_Page.core.php:1046

We could make the above more defensive to avoid fatal errors. Not deleting transactions does solve the problem of what to do about MER and how it allows a transaction to have registrations from several events. So no issue if all registrations from one event are deleted, and some transactions have registrations from that event and registrations from other events.

The downside would be there's currently no UI to delete transactions, which between those and line items it can also add quite a bit of data onto the database.

do we want to have a confirmation page?

Either a confirmation page or an alert that allows the user to get out if they got there by mistake.

@mnelson4
Copy link
Contributor

mnelson4 commented Nov 8, 2019

The downside would be there's currently no UI to delete transactions, which between those and line items it can also add quite a bit of data onto the database.

I'd be inclined to get this working without deleting transactions (assuming it doesn't take a crazy amount of effort to avoid that fatal error you mentioned and others like it), and then add the ability to delete transactions on a separate ticket.

Either a confirmation page or an alert that allows the user to get out if they got there by mistake.

I'm inclined to do the confirmation page because it's also an opportunity to point out all the related data that will get deleted...

Any further thoughts on that? If not, we should probably get feedback from Brent next.

@joshfeck
Copy link
Contributor

joshfeck commented Nov 8, 2019

It probably wouldn't take much to make the code a bit more defensive.

@mnelson4
Copy link
Contributor

KK, I'd like to make a p2 with a proposal, especially looking for feedback from Brent and anyone who wants to chime in.

@mnelson4
Copy link
Contributor

Here's the p2, based on my earlier post on this ticket, but I removed some of the questions because I get the impression we're inclined to have the confirmation page, and will leave transactions alone initially. (If that' incorrect chime in here or there.)

While I'm open to feedback from everyone, I think only Brent's additional blessing is needed to move forward.

@mnelson4
Copy link
Contributor

Ok so the main feedback from the p2 so far is:

  • make sure users have a database backup before deleting
  • have users confirm the event they're wanting to delete (like how GitHub gets you to type in the repo's name). The details of this are being discussed
  • business logic would probably fit in /core/services/admin or even /core/services/orm folder
  • if more meta-info is required to make this work, it may be best to store in model-specific config files, rather than directly on the models
  • using the command bus seems appropriate
  • log when events were deleted (although having a UI to show them isn't yet essential)

@mnelson4
Copy link
Contributor

mnelson4 commented Nov 22, 2019

The plan:

Created
#1904
#1905
#1906

@mnelson4
Copy link
Contributor

I've created sub-tickets for the different parts of this project, so closing this one

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants