Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chrono privacy: no longer return create attribute in API #1295

Merged
merged 14 commits into from
May 5, 2024
Merged

Conversation

elrido
Copy link
Contributor

@elrido elrido commented May 1, 2024

This PR addresses a concern brought up in #1290

Changes

  • stops adding a create attribute to the pastes metadata
  • removes create or postdate attributes in pastes metadata, if they still exist
  • drops postdate column in paste table

ToDo

  • discuss: in the paste comments, the create date is internally used to sort the comments, as well as being displayed on the rendered comment section. Should we add an option to disable comment date display? If yes, we would still add and store it internally for sorting, but strip it before returning the comments array.
  • await merge of bootstrap 5 template, to ensure SRI gets updated in there as well

Copy link
Member

@rugk rugk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but I don't see comments handling here? Likely because you did not change it – that still returns the creation time of the comments, does not it? Edit: ah, read it now.

doc/Installation.md Show resolved Hide resolved
@rugk
Copy link
Member

rugk commented May 1, 2024

Should we add an option to disable comment date display? If yes, we would still add and store it internally for sorting, but strip it before returning the comments array.

Yeah, would be fine for me and would be an advantage.

Alternatively, for improved zero knowledge, we could make the DB field nullable and never store it if the setting is disabled. The Frontend should then be able to handle non-timestamp entries there gracefully and just don't show a time then, i.e. if the setting is changed it could mean we have some comments shown with a timestamp and some without.
IMHO, that could increase privacy and is an acceptable and totally understandable (logical) tradeoff, when that setting is disabled.

rugk
rugk previously approved these changes May 1, 2024
@nain-F49FF806
Copy link

nain-F49FF806 commented May 1, 2024

@elrido Thanks for implementing this feature! Looks good.

I think rugk's suggestion for comments also makes sense. When enabling the option to not display comment dates, perhaps we should not even do any server side sorting? (and thus not have need to store the created date at all).

Most database backends will still likely return the comment entries in order they were added (even without a timestamp key for sorting). Even if they don't, it's not a big deal.

Even unsorted comments still have a tree structure because of their parent_id. So any conversation in the discussion will still be followable as comment threads.

@rugk
Copy link
Member

rugk commented May 1, 2024

Most database backends will still likely return the comment entries in order they were added (even without a timestamp key for sorting). Even if they don't, it's not a big deal.

Well… the file backend likely does not guarantee this (but this is just my wild guess). But thanks yeah did not think of sorting when suggesting that we do not save the data at all.

One thing I'd like to avoid and disagree here is we should keep the order, always. Otherwise this can change meaning in the discussion when no tree-like structure was used. And as meaning is a content, we should mot introduce changes there… (Just imagine a comment "The previous comment is fake!" etc.)
An alternative would be just to introduce (if not already there) a sequential ID (like a primary key in a DB) which you can use for sorting rather than a timestamp. Also, that should be faster/needs less CPU power…

@nain-F49FF806
Copy link

nain-F49FF806 commented May 1, 2024

(Just imagine a comment "The previous comment is fake!" etc.)

I think such comments are best done as replies. Maybe it's a personal influence of platforms like Reddit / Instagram, where one does not rely on ordering of comments at the same level. Because comments are almost always reordered based on upvotes / algorithm or what not. In such platforms, the reply feature exists primarily for this use case, to refer explicitly to some previous comment.

@nain-F49FF806
Copy link

nain-F49FF806 commented May 1, 2024

An alternative would be just to introduce (if not already there) a sequential ID (like a primary key in a DB) which you can use for sorting rather than a timestamp. Also, that should be faster/needs less CPU power…

Yes, this would be good!. I thought of suggesting this too (many databases have increment feature), but since this would not be applicable for file backends, it seemed an incomplete solution.

I do feel that even without temporal order, comments have value. If the ordering is relevant, the option to display dates can be left enabled.

One way to look at this is that other than sensitive stuff like IPs, the server shouldn't be storing stuff, processing the hidden data and stripping it before sending to clients.

Most of the logic should be client side. It should be explicit (as much as possible).

@elrido
Copy link
Contributor Author

elrido commented May 2, 2024

So the problem that the unix timestamp solves for us, is that only the database backend stores the value separately, making it easy to sort. In all other backends, that value is stored inline and the order is not guaranteed[0]. That means to sort by it, all comments have to first get read, kept in memory and sorted there. Now, yes, in the database backend we could simply use the auto-incrementing primary key to sort by. But in the other backends (files & cloud storage) we would need to add a new "bucket" to track that serial number per paste and then hope we don't run into concurrency or locking issues if several comments occur at the same time.

In short, the current scheme is simple (and elegant?) in that it requires only storing and reading whole pastes and comments and no updates (only deletes of the whole things). Comments can be created without any knowledge of other comments. The unix timestamp happens to be roughly monotonic and for the cases it is not, the only side-effect is that some comments may get ordered incorrectly (i.e. during leap second smearing, hard NTP time shifts or if posted in the same second).

This is why I asked, as it could turn into a quite severe rewrite of the backend storage if we wanted to remove the timestamps from the comment storage, the way I did it for the pastes (where it really serves no purpose).

Since you all seem in favor to remove it from comments as well, I will add the configuration option for this ("display comment creation time", default true) and remove the internal value from being returned if that gets disabled, but for now will keep the internal time-based mechanism.

If you can think of any other value that isn't time (or at least not by itself), but available to PHP and at least roughly monotonically increasing, we could swap that in (kinda, existing comments would need to keep using time, but new pastes could be marked to allow for this new value). To work for all backends, it would need to be something that requires no extra storage and no synchronization between requests. It needed to be something inherent to the PHP runtime, that survives PHP restarts, server migrations, etc.

[0] for example in the filesystem backend, the comments are stored as files under a name made from paste, parent and comment IDs, and at best these are returned in lexical order (though some filesystems don't guarantee that either).

cfg/conf.sample.php Outdated Show resolved Hide resolved
lib/Model/Paste.php Show resolved Hide resolved
elrido and others added 6 commits May 4, 2024 15:44
Co-authored-by: rugk <rugk+git@posteo.de>
PHP 8.2 deprecates implicit conversion from float to int if it loses precision, hence the explicit conversion. I missed these in 6bcef2f
@elrido elrido merged commit 843aa00 into master May 5, 2024
16 of 17 checks passed
@elrido elrido deleted the chrono-privacy branch May 5, 2024 17:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants