Skip to content
This repository has been archived by the owner on Aug 11, 2024. It is now read-only.

Fetch basic user metadata from SQLite if available #925

Merged
merged 28 commits into from
Oct 12, 2023

Conversation

bfollington
Copy link
Collaborator

@bfollington bfollington commented Oct 5, 2023

Fixes #924

This turned into quite a meaty refactor. The goal is to avoid hitting the network to produce the feed or our local profile view. To accomplish this we had to revisit our entire approach to loading and caching user profiles. While a lot of code changed, the new approach is much simpler: we have two code paths, one for loading user data for list views and another for the profile detail view.

I've tested loading the feed with several hundred posts in it from various peers and it loads instantly for me without reading any of the peer's _profile_ memos.

Changes

  • Introduce UserProfileService.identifyUser which consults the DB to produce a UserProfile and never traverses (intended for list views which need user metadata)
    • If we don't follow this user or their profile is missing from the DB we fall back to a "shallow" profile display
  • Remove UserProfileService.buildUserProfile
  • Roll buildUserProfile into UserProfileService.loadFullProfileData which always fetches the latest profile from noosphere (intended for the profile detail view)
    • We do not fetch profiles for any users in the following list unless we follow them and have their profile indexed in the DB
  • Consolidate and fix UserProfile.aliases logic
  • EntryStub now comes with a did field

TODO

  • explicitly read from memo for the actual users profile, fall back to DB
  • when listing users if we follow them, check the DB for their profile, fallback to nothing
  • stop passing user profiles around everywhere
  • do a join to list the feed?
  • show AKA blocks on followed users

Uses the same code path as our peers. We were loading the profile for every post we displayed in the feed, this cuts it down to one profile read per sphere revision.
@bfollington bfollington marked this pull request as draft October 5, 2023 03:41
Comment on lines +24 to +30
return NameVariant.known(Slashlink.ourProfile, selfNickname)
case (_, .following(let petname), _, _):
return NameVariant.petname(self.address, petname)
return NameVariant.known(self.address, petname)
case (_, .notFollowing, .some(let selfNickname), _):
return NameVariant.selfNickname(self.address, selfNickname)
return NameVariant.unknown(self.address, selfNickname)
case (_, .notFollowing, _, .some(let proposedName)):
return NameVariant.proposedName(self.address, proposedName)
return NameVariant.unknown(self.address, proposedName)
Copy link
Collaborator Author

@bfollington bfollington Oct 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simplified the modelling for name display, we either know a user or we don't.

If we know them, we use our name for them falling back to their nickname for themselves. If we don't know them, we use their nickname (if available) or fall back to the leaf of their current address.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the clarity, however, I want to note that this loses us information.

It seems likely at some point we will want to be able to distinguish proposed names from self-proposed names (https://github.com/cwebber/rebooting-the-web-of-trust-spring2018/blob/petnames/draft-documents/petnames.md). For example,

  • We may want to prefer a self-proposed name to a 3p-proposed name when following and suggesting a name.
  • (Speculative) We may wish, in future, to show the user's self-proposed name in context menus or place a badge next to self-proposed names to show user's preferred name.

Copy link
Collaborator Author

@bfollington bfollington Oct 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't actually lose us any information in the model itself, just in the mapping to an (under utilized) view construct. We were rendering proposedName and selfNickname identically before this change so nothing visibly changes as a result of this.

If the kind of cases you point out do arise, we can derive all that we need from a UserProfile regardless.

Comment on lines 121 to 124
struct UserProfileCacheKey: Hashable {
let did: Did
let petname: Petname?
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same user with the same Did can have different UserProfiles depending on the context you encounter them.

A trivial example: following the same Did under two Petnames changes which name is chosen as the "primary" name, depending on which version you're looking at.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth saying: I'm open to the idea of dropping the in-memory cache. The main drawback is that we'd be sending a query to SQLite for each post in the feed except in the case of our profile which is read from our sphere.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can keep it for now. My main question would be how is it invalidated?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, instead of hitting the DB for each post, couldn't we work this logic into the SQL query that constructs the feed?

Copy link
Collaborator Author

@bfollington bfollington Oct 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep! I was considering that in this comment: #925 (comment)

But I realised today that since EntryStub now has the Did attached, I don't need to load a profile at all for the feed. In fact, I managed to drop author from EntryStub entirely (I wasn't a fan of having to introduce it in the first place).

Now, we can produce the feed + backlinks + transcludes without having to fetch the user's profile in the process. If we introduce something like customisable PFPs in the future we will have to revisit this but I would argue that that's a whole seperate domain & merits a different approach.

As a result, there's really no benefit coming from the in-memory cache so... I removed it. One less invalidation layer to care about.

@bfollington bfollington marked this pull request as ready for review October 10, 2023 04:40
@bfollington bfollington changed the title Cache our profile data Fetch basic user metadata from SQLite if available Oct 10, 2023
Copy link
Collaborator

@gordonbrander gordonbrander left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the whole, looks great. Requests for some tests given this touches DB queries, which is error prone.

Comment on lines +24 to +30
return NameVariant.known(Slashlink.ourProfile, selfNickname)
case (_, .following(let petname), _, _):
return NameVariant.petname(self.address, petname)
return NameVariant.known(self.address, petname)
case (_, .notFollowing, .some(let selfNickname), _):
return NameVariant.selfNickname(self.address, selfNickname)
return NameVariant.unknown(self.address, selfNickname)
case (_, .notFollowing, _, .some(let proposedName)):
return NameVariant.proposedName(self.address, proposedName)
return NameVariant.unknown(self.address, proposedName)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the clarity, however, I want to note that this loses us information.

It seems likely at some point we will want to be able to distinguish proposed names from self-proposed names (https://github.com/cwebber/rebooting-the-web-of-trust-spring2018/blob/petnames/draft-documents/petnames.md). For example,

  • We may want to prefer a self-proposed name to a 3p-proposed name when following and suggesting a name.
  • (Speculative) We may wish, in future, to show the user's self-proposed name in context menus or place a badge next to self-proposed names to show user's preferred name.

Comment on lines 121 to 124
struct UserProfileCacheKey: Hashable {
let did: Did
let petname: Petname?
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can keep it for now. My main question would be how is it invalidated?

Comment on lines 121 to 124
struct UserProfileCacheKey: Hashable {
let did: Did
let petname: Petname?
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, instead of hitting the DB for each post, couldn't we work this logic into the SQL query that constructs the feed?

/// we have encountered.
///
/// `context` will be used to determine the preferred navigation address for a user.
func identifyUser(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a lot going on in this function. Probably deserves a test.

Copy link
Collaborator Author

@bfollington bfollington Oct 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had thought this would be too difficult to test without traversal, but after some thinking I came up with some reasonable test cases that cover the logic within this function itself.

See Tests_UserProfileService.testIdentifyUserCallSignatures and Tests_UserProfileService.testIdentifyUserRebaseOnContext

Now that we have the DID we actually don't need to attach an author to an EntryStub _at all_.

In the future, if we have customisable PFPs for example, we may need to load author information but I would argue that's better as background metadata we fetch separately from the notes themselves to maximise resilience.
Also, fix the broken query
@bfollington bfollington force-pushed the 2023-10-05-cache-our-profile branch from 3080e5d to 3b7cf40 Compare October 11, 2023 01:09
peer: story.author.address.peer,
slug: story.entry.address.slug
)
pfp: .generated(story.entry.did),
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is all we were using author for, which can be derived from just the DID.

/// Attempt to read & deserialize a user `_profile_.json` at the given address.
/// Because profile data is optional and we expect it will not always be present
/// any errors are logged & handled and nil will be returned if reading fails.
private func readProfileMemo(
func readProfileMemo(
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Public so we can access it in tests.

@@ -56,6 +56,7 @@ struct TestUtilities {
path: "database.sqlite",
directoryHint: .notDirectory
)
print("Test SQLite Path: \(databaseURL.path(percentEncoded: false))")
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Helpful for debugging tests.

Copy link
Collaborator

@gordonbrander gordonbrander left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Feel free to land.

Have you play tested? It's late my time, but will playtest myself in the AM.

.toString()?
.toLink()?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's subtle, but

  • A Link is always a Did plus Slug
  • A slashlink may be a Did plus Slug, a Petname plus Slug, or a Slug.
  • A Link is a stable identifier that can be used for indexing
  • A Slashlink is a locator which may be used to resolve a piece of content. It is not a stable identifier.

More background thinking in this commit message: f76a564

The Link is really only used for the database layer to provide an unambiguous identifier.

We store a serialized Link in the id field of the database (not a Slashlink). So we read it back out as a link and then convert it to a slashlink.

@bfollington bfollington merged commit b4b8435 into main Oct 12, 2023
2 checks passed
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve profile loading performance & offline behaviour
2 participants