Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[mds-compliance][mds-db] Fix/compliance handles timestamp #316

Merged
merged 17 commits into from
May 15, 2020

Conversation

janedotx
Copy link

Update compliance /count/ endpoint to return relevant policy and to accept a timestamp parameter. Update compliance /snapshot endpoint to search only for policies active at exactly the time specified by the timestamp parameter.

PR Checklist

  • simple searchable title - [mds-db] Add PG env var, [config] Fix eslint config
  • briefly describe the changes in this PR
  • mark as draft if should not be merged
  • write tests for all new functionality

Impacts

  • Provider
  • Agency
  • Audit
  • Policy
  • Compliance
  • Daily
  • Native
  • Policy Author

? { end_date: query_end_date, start_date: query_end_date - days(365) }
: { end_date: now() + days(365), start_date: now() - days(365) }

const timestamp = query_end_date || now()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Proposal: Breaking change (but we have nothing querying for historical compliance and the spec is still in PR so should be okay

We should just change the query_end_date to a query param of timestamp, then we can just do a one-liner and assign a default value on line 81

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I support renaming, but I'd rather not assign a default value. Easier to condition on timestamp being null when determining whether to get historical events or read from the cache that way.

if (!AllowedProviderIDs.includes(res.locals.provider_id)) {
return res.status(401).send({ result: 'unauthorized access' })
}

const timestamp = query_end_date || now()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally unnecessary to do this, just don't rename to query_end_date when destructuring on line 163 and instead supply a default value of now()

const { timestamp: query_end_date } = {
...parseRequest(req, { parser: Number }).query('timestamp')
}
//* /

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unsure what these comments are for

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It renders like normal code in github, I can't tell if the block is commented out or not

@@ -56,6 +56,22 @@ export async function readPolicies(params?: {
return res.rows.map(row => row.policy_json)
}

export async function readActivePolicies(timestamp: Timestamp = now()): Promise<Policy[]> {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see why we can't just reuse the readPolicies method here, but if there's a need to create a new method then the naming seems a little off -- when I think of active policies I think of start_date <= now && end_date >= now && published roughly (with the requirement for some null checking as well). It seems that the start/end restrictions are being met, but it's not considering if the policies are published or not

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is checking for a non-null publish_date on line 79.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but publish_date should take into account the timestamp

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that was not the case when I made this comment ;) I think the fundamental issue I have with this is that we're using a different method for determining "active policies" between Policy and Compliance. They really ought to be using the same methodology, which is tricky because Policy enables the user to get non-active policies depending on their permissions.

As much as I'd like to do all of the filtering in the DB query, I think what may make sense is to just use the same logic that we have for Policy

const policies = await db.readPolicies({ get_published, get_unpublished })
      const prev_policies: UUID[] = policies.reduce((prev_policies_acc: UUID[], policy: Policy) => {
        if (policy.prev_policies) {
          prev_policies_acc.push(...policy.prev_policies)
        }
        return prev_policies_acc
      }, [])
      const active = policies.filter(p => {
        // overlapping segment logic
        const p_start_date = p.start_date
        const p_end_date = p.end_date || Number.MAX_SAFE_INTEGER
        return end_date >= p_start_date && p_end_date >= start_date && !prev_policies.includes(p.policy_id)
      })

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Indentation got substantially borked, but you get the idea)

Comment on lines 68 to 72
return res.rows
.map(row => row.policy_json)
.filter(policy => {
return policy.end_date === null || policy.end_date >= timestamp
})

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems super strange to me that we do the start_date check as part of the query, but then do a filter on the result for end_date

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noted. I'm keeping readActivePolicies though, since readPolicies is a more generalized method that allows you to set get_unpublished to true, and it feels weird to have that available as an option when active policies must be published.

Comment on lines 210 to 211
const deviceRecords = await db.readDeviceIds()
const deviceIDs = deviceRecords.map(recorded_device => recorded_device.device_id)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should do these reads unless we're operating in a 'historical' context

async function fail(err: Error) {
logger.error(err.stack || err)
if (err.message.includes('invalid rule_id')) {
if (err.message.includes('invalid rule_id') || err.message.includes('not found in any policy')) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should try to avoid adding more things into this error handling pattern if possible

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, if we need to do this sort of error handling branching I'd rather be comparing status codes, string enums, or doing instanceof checks.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixing the error handling feels out of scope. I'd rather keep this PR focused on handling the timestampparam.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds reasonable, maybe a good candidate for a follow up PR

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was going to split that out into a different PR so it'd be easier to read, but I think it's faster to just stick it all in this one.

Comment on lines 186 to 190
const rule = await db.readRule(rule_id)
const policies = await db.findPoliciesByRule(rule_id)
if (policies.length === 0) {
throw new NotFoundError('Rule ID not found in any policy')
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should just get rid of the db.readRule method and extract the rule from whatever policy is returned

@@ -267,6 +283,19 @@ export async function readRule(rule_id: UUID): Promise<Rule> {
}
}

export async function findPoliciesByRule(rule_id: UUID): Promise<Policy[]> {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should make this only return a single policy instead of an array, especially if you're throwing an error upon the length != 1. That being said, I still think we could just use readPolicies and just add an extra param for rule_id and the conditional statement

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error is only thrown if zero policies are returned. It doesn't say in the spec that rules must have unique UUIDs and there's nothing preventing multiple policies from sharing a rule at the moment.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error is thrown when res.rowCount !== 1 not res.rowCount === 0. The count endpoint is really only valid under the assumption that rule_ids are unique, even though it's not explicit in the spec (or enforced in the implementation)

Copy link

@macsj200 macsj200 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to learn a little bit more about what this is doing before I +1, maybe we can do a quick screenshare.

async function fail(err: Error) {
logger.error(err.stack || err)
if (err.message.includes('invalid rule_id')) {
if (err.message.includes('invalid rule_id') || err.message.includes('not found in any policy')) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, if we need to do this sort of error handling branching I'd rather be comparing status codes, string enums, or doing instanceof checks.

const events =
query_end_date && timestamp < now()
? await db.readHistoricalEvents({ end_date: timestamp })
: await cache.readEvents(deviceIDs)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change to cache.readAllEvents

const deviceRecords = await db.readDeviceIds()
const deviceIDs = deviceRecords.map(recorded_device => recorded_device.device_id)
const events =
query_end_date && timestamp < now()
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get rid of timestamp < now() in both endpoints

}

const { rule_id } = req.params
try {
const rule = await db.readRule(rule_id)
const policies = await db.readActivePolicies(query_date)
const filteredPolicies = policies.filter(policy => {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of doing this filtering in TS, we could adapt the SQL query to look for a matching rule_id in the policy json.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ought to also just be destructuring to pull off the head like const [policy] = db.readActivePolicies(...), I see that on line 209 the policy that we send in the response is not necessarily the policy we considered when computing compliance

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ought to also just be destructuring to pull off the head like const [policy] = db.readActivePolicies(...), I see that on line 209 the policy that we send in the response is not necessarily the policy we considered when computing compliance

I don't see how that's possible, now that rule uniqueness is actually enforced.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because filteredPolicies !== policies. We consider the head of the filteredPolicies for the evaluation, but in the response body we are returning policies[0].

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the policy that we return in the payload may not be a policy that even has the rule_id in question

packages/mds-db/policies.ts Outdated Show resolved Hide resolved
Comment on lines 139 to 143
const other_policies = await readPolicies({ rule_id: rule.rule_id })
other_policies.map(other_policy => {
if (other_policy.policy_id !== policy.policy_id) {
throw new ConflictError(`Policy containing with a rule of id ${rule.rule_id} already exists`)
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We typically use camelCase as opposed to snake_case for variables, unless they're based off of DB columns or query parameters.

packages/mds-compliance/api.ts Show resolved Hide resolved
@janedotx janedotx merged commit 86d157c into develop May 15, 2020
@janedotx janedotx deleted the fix/compliance-handles-timestamp branch May 15, 2020 22:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants