-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RN-400: Meditrak permissions based sync (server-side changes) #3992
Conversation
* } | ||
* Responds to GET requests to the /changes/metadata endpoint | ||
*/ | ||
export async function changesMetadata(req, res) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that this endpoint returns more/different data to the /changes/count
endpoint, I decided to create a new one with a new name. Also makes maintaining support for the old logic a little easier.
}); | ||
const changesCount = await models.meditrakSyncQueue.count(filter); | ||
|
||
const { query: dbQuery } = await getChangesFilter( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re-wrote this to just build up an actual SQL query rather than a query object. I find once the queries get beyond a certain level of complexity then it's just much easier to work with direct SQL. Still feels a bit odd passing in the select statement here though so I understand if reviewer wants to push back.
const changes = await query.executeOnDatabase(database); | ||
const changesByRecordType = groupBy(changes, 'record_type'); | ||
const recordTypesToSync = Object.keys(changesByRecordType); | ||
const columnNamesByRecordType = Object.fromEntries( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rewrote this to do all the database stuff upfront in batch. Made server-side processing of changes much faster. Now the bottleneck is on the app side write to database.
export const supportsPermissionsBasedSync = version => | ||
semverCompare(version, PERMISSIONS_BASED_SYNC_MIN_APP_VERSION) >= 0; | ||
|
||
export const getCountriesAndPermissionsToSync = async req => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's where we do the little shimmy to work out which new countries and permission groups we need to sync. Will add more comments around this code explaining why
* This is used to improve the speed of querying the meditrak_sync_queue when | ||
* doing a permissions based sync | ||
*/ | ||
const query = new SqlQuery(` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Big ol' SQL query yay! Lemme know if you'd prefer me to abstract this into something more high level? I don't mind it as it stands cos it's more readable than it would be otherwise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aaaah 😱
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually it's really straightforward once you know what each line does and see how much repetition there is. I'd be keen to try it with some higher level functions pulled out, might give it a go while you're away if I get a chance. No need to though, agree this is pretty readable once you overcome the "shit that's dense" vibe
packages/central-server/src/index.js
Outdated
*/ | ||
if (isFeatureEnabled('MEDITRAK_SYNC_QUEUE')) { | ||
const meditrakSyncQueue = new MeditrakSyncQueue(models); | ||
await meditrakSyncQueue.createPermissionsBasedView(); // Ensure permissions based view has been setup |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Always re-create on server startup? Simple: yes, efficient: maybe no... especially since this adds about 30 seconds to the server start time...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But good call to go simple to start with, and optimise later if we need to
exports.up = function (db) { | ||
return db.runSql(` | ||
DROP AGGREGATE IF EXISTS array_concat_agg(anyarray); | ||
CREATE AGGREGATE array_concat_agg(anyarray) ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Custom aggregate function to help with the permissions based meditrak sync queue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool 😎
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is some great work, thanks heaps for this @rohan-bes! Our users are going to absolutely love it, and it fixes a real point of embarrassment during initial sync (i.e. peoples first impression of meditrak!)
I think you've gone for a good compromise in the design decision to not bother looking the country x permission group combination, but instead just going for the simpler but mostly-just-as-good check for if the user has the permission group in any country, and any permission group in the country.
exports.up = function (db) { | ||
return db.runSql(` | ||
DROP AGGREGATE IF EXISTS array_concat_agg(anyarray); | ||
CREATE AGGREGATE array_concat_agg(anyarray) ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool 😎
const changes = await database.find(TYPES.MEDITRAK_SYNC_QUEUE, filter, { | ||
sort: ['change_time'], | ||
const msqColumns = await models.meditrakSyncQueue.fetchFieldNames(); | ||
const changeFilterFunction = supportsPermissionsBasedSync(appVersion) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you think of a good verb to replace this with?
I don't always mind a function being named with a noun, but because "change" can also be a verb, I find this one particularly confusing!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might do a bit of renaming around this area. I now recognise that the nature of this function has actually changed (it used to just build up a filter object, but now it builds a query to fetch the changes).
But I'll take note to ensure whatever happens this line reads more intuitively 👍
? getPermissionsBasedChangesFilter | ||
: getChangesFilter; | ||
const { query } = await changeFilterFunction(req, { | ||
select: msqColumns.join(', '), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
select: msqColumns.join(', '), | |
select: await models.meditrakSyncQueue.fetchFieldNames(), |
Rather than a named variable here I find it easier to understand what's happening if it's all on one line
* This is used to improve the speed of querying the meditrak_sync_queue when | ||
* doing a permissions based sync | ||
*/ | ||
const query = new SqlQuery(` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aaaah 😱
JOIN survey s ON s.id = ss.survey_id | ||
WHERE s.code = ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little confused here...why can't we use
JOIN survey s ON s.id = ss.survey_id | |
WHERE s.code = ? | |
WHERE ss.survey_id = ? |
And provide this.id? Is this sometimes run before the survey has been committed to the db and therefore may not yet have an id generated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah good catch! I'll fix this up
if (sort) { | ||
query = query.concat(` | ||
ORDER BY ${sort} | ||
`); | ||
} | ||
|
||
if (limit !== undefined) { | ||
query = query.concat(` | ||
LIMIT ? | ||
`); | ||
params.push(limit); | ||
} | ||
|
||
if (offset !== undefined) { | ||
query = query.concat(` | ||
OFFSET ? | ||
`); | ||
params.push(offset); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some of this stuff could be pulled out into a common DRY function above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, will do 👍
const typesWithoutPermissions = ['country', 'permission_group']; // Sync all countries and permission groups (needed for requesting access) | ||
const permissionsClauses = [ | ||
{ | ||
query: `"type" = ? AND record_type IN ${SqlQuery.record(typesWithoutPermissions)}`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer to separate these into two separate clauses, as it confused me initially thinking that we were only excluding deletes for the record types without permissions. I understand that is hard with the OR
s between everything, so maybe we set up an initial query string outside of this first?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I can come up with something to address this 👍
`REFRESH MATERIALIZED VIEW CONCURRENTLY permissions_based_meditrak_sync_queue;`, | ||
); | ||
const end = Date.now(); | ||
winston.info(`permissions_based_meditrak_sync_queue refresh took: ${end - start}ms`); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How long does this generally take, out of interest?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll double check this, but I think I remember it taking a few seconds. Since it's a standard materialized view it basically has to fully rebuild each time, but the underlying query doesn't take too long.
If it were a fast refresh materialized view however... 😉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hahaha nooo don't do it!!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, so it takes 30 seconds to refresh, my bad. Makes sense as that's how long it takes to build the table initially. However, it's a concurrent refresh, which means the table can be safely queried while it's refreshing
...questionChanges, | ||
...optionSetChanges, | ||
...optionChanges, | ||
]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to run some stress testing on this - does it perform ok if e.g. all of the ssc's are being changed anyway (because they're deleted and created again) so they're also being added to the sync queue through the normal process? We don't hit too much locking if you do that with a bunch of surveys in a row?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call! I had sort of assumed that this change handler would run after all the associated changes had resolved due to the debounce time, but it's definitely possible that it could kick off halfway which might make things awkward. I'll make a note in the test plan 👍
packages/central-server/src/index.js
Outdated
*/ | ||
if (isFeatureEnabled('MEDITRAK_SYNC_QUEUE')) { | ||
const meditrakSyncQueue = new MeditrakSyncQueue(models); | ||
await meditrakSyncQueue.createPermissionsBasedView(); // Ensure permissions based view has been setup |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But good call to go simple to start with, and optimise later if we need to
Oh and one more general comment: I think this could do with more inline descriptions throughout. You've articulated the strategy really clearly in the PR description, and at a minimum I'd love to see that just copied into the top of one of the files. |
adf82df
to
2e35a56
Compare
const query = new SqlQuery(` | ||
DROP MATERIALIZED VIEW IF EXISTS permissions_based_meditrak_sync_queue; | ||
CREATE MATERIALIZED VIEW permissions_based_meditrak_sync_queue AS | ||
SELECT msq.*, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@edmofro bit of a change here in the latest commit. Before we had a separate field for each record type permissions, eg. (survey_countries, survey_permission_groups, survey_screen_countries, etc.)
This was pretty ugly and tedious, so I'm merging them into a single value using COALESCE here. Since there's just a single value per record type I think it should all be fine, and makes things a lot more readable!
meditrak_sync_queue and permissions_based_meditrak_sync_queue
…missions to - Introduced /changes/metadata route to let app know which countries and permissions groups are being synced - Rewrote getChangesFilter to build an SqlQuery rather than using our ORM
- buildMeditrakSyncQuery and buildPermissionsBasedMeditrakSyncQuery - This is closer to what it's new logic is
- Reworked permissions_based_meditrak_sync_queue to just have single 'coutry_ids' and 'permission_groups' columns - Added comments explaining how permission based sync works
5eac074
to
a9979fe
Compare
SFUNC = array_cat, | ||
STYPE = anyarray | ||
); | ||
`); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@edmofro there's actually a little start up problem when trying to deploy this migration.
In order to run this migration, the MeditrakSyncQueue
must be running.
In order to create the MeditrakSyncQueue
the database must have the array_concat_agg
function.
I guess this highlights a genuine dependency cycle in our code logic, however I feel it'll be acceptable for us to just manually install this function in the prod database prior to releasing this code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep I'm fine with that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All looking great!
- countChanges wasn't using permissions based sync, so thought there were 0 changes
- Fixed bug where not all changes where synced when user permissions changed
…t route - Permissions based sync uses changes/metadata instead
02fb68b
to
56df105
Compare
…to deployment logic - Adding 30 seconds to the central-server start time was proving too annoying
Issue RN-400:
Server side changes for the new meditrak permissions based sync.
Sync logic:
This algorithm is hopefully sufficiently simple and flexible (can handle new user logins, as well as user permission changes) without syncing too much data, especially during the initial sync.
In order to filter the changes to sync by the countries and permission groups, I've created a new materialized view (
permissions_based_meditrak_sync_queue
) which theMeditrakSyncQueue
manages.I've tried to break this PR up into clean separate commits, if that's easier for the reviewer.