Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CBG-3721: [Diagnostic API] Import filter and sync function dry run endpoints #6715

Merged
merged 28 commits into from
Apr 9, 2024

Conversation

mohammed-madi
Copy link
Contributor

@mohammed-madi mohammed-madi commented Mar 5, 2024

CBG-3721

Describe your PR here...

  • Added 2 endpoints to the diagnostic API, sync function and import filter dry runs
  • Added API docs
  • Added TestGetDocDryRuns

Pre-review checklist

  • Removed debug logging (fmt.Print, log.Print, ...)
  • Logging sensitive data? Make sure it's tagged (e.g. base.UD(docID), base.MD(dbName))
  • Updated relevant information in the API specifications (such as endpoint descriptions, schemas, ...) in docs/api

Dependencies (if applicable)

  • Link upstream PRs
  • Update Go module dependencies when merged

Integration Tests

Copy link

github-actions bot commented Mar 5, 2024

Redocly previews

@@ -373,6 +373,9 @@ func createDiagnosticRouter(sc *ServerContext) *mux.Router {
keyspace := r.PathPrefix("/{keyspace:" + dbRegex + "}/").Subrouter()
keyspace.StrictSlash(true)
keyspace.Handle("/{docid:"+docRegex+"}/_all_channels", makeHandler(sc, adminPrivs, []Permission{PermReadAppData}, nil, (*handler).handleGetDocChannels)).Methods("GET")
keyspace.Handle("/_sync", makeHandler(sc, adminPrivs, []Permission{PermReadAppData}, nil, (*handler).handleSyncFnDryRun)).Methods("GET")
keyspace.Handle("/_import_filter", makeHandler(sc, adminPrivs, []Permission{PermReadAppData}, nil, (*handler).handleImportFilterDryRun)).Methods("GET")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went back to the spec (after looking at your changes from the last review), and it looks like this endpoint is supposed to support either a docID on the path, or providing the body directly. I think that still makes sense - did you decide to change that intentionally?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same applies to _sync - the spec looks like we want to support either docID or a provided body. In this case it might be the spec that needs to be updated, but what do you think the ideal behaviour is for this one?

Copy link
Contributor Author

@mohammed-madi mohammed-madi Mar 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was going to update the spec once this is merged but if the body is provided I dont think theres a need for a doc id, and the docid can be an optional query for sync functions that use oldDoc

docid := h.getQuery("docid")
revID := h.getQuery("rev")

body, err := h.readDocument()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it required to specify a body for this call? If yes, should we return error if it's empty? If no, does line 64 succeed if not present?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the body is empty and no doc id is not provided it will error, if the body is not provided but a doc id is provided, the doc in the bucket will be used as the body ( will return an error if the doc isnt there).
Ive added test cases for both scenarios

ID: docID,
_body: body,
}
oldDoc := doc
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like if a body is passed in and docID is not found, we're using the body as both doc and oldDoc. I don't think this is what users will want to do - if they are only passing body, I think it should behave as an insert (oldDoc=null)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its treated as an insert if there is no docID, but olddoc is used in prepareSyncFn. If doc id is not found it returns 404

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is an insert, we need to ensure we're not populating oldDoc body in the sync function (people trying to test sync functions that have different handling for insert/update aren't going to work). If oldDoc needs to be non-nil, I expect you at least need to set oldDoc._body=nil? Or is this handled in another way that I'm overlooking?

db/crud.go Outdated
return nil, nil, nil, nil, base.HTTPErrorf(http.StatusBadRequest, "Invalid revision ID")
}
generation++
oldDoc.RevID = body.ExtractRev()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What body is this extracting the rev from? It looks like it's the body that was passed in - I expected it to be docInBucket.SyncData.CurrentRev. Do we have unit tests that validate that rev is being properly populated into doc and oldDoc when doing a dry run of the sync function?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If no doc id is given, the rev id is not needed anywhere. It's only used for getting the parent rev if a doc id is given. I verified it with debug logging and I would probably have to return the rev id in the dry run response if we wanted to write unit tests to check it

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concern was whether oldDoc._rev is being set for use by the sync function. Looking at TestSyncFnOldDocBodyProperties it seems like we don't populate it in any case.

It's still be incorrect to set oldDoc.RevID to the value specified in body though (which is doc.rev, not oldDoc.rev). Does something break if you omit setting this property, if you don't think it's used?

ID: docID,
_body: body,
}
oldDoc := doc
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is an insert, we need to ensure we're not populating oldDoc body in the sync function (people trying to test sync functions that have different handling for insert/update aren't going to work). If oldDoc needs to be non-nil, I expect you at least need to set oldDoc._body=nil? Or is this handled in another way that I'm overlooking?

db/crud.go Outdated
return nil, nil, nil, nil, base.HTTPErrorf(http.StatusBadRequest, "Invalid revision ID")
}
generation++
oldDoc.RevID = body.ExtractRev()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concern was whether oldDoc._rev is being set for use by the sync function. Looking at TestSyncFnOldDocBodyProperties it seems like we don't populate it in any case.

It's still be incorrect to set oldDoc.RevID to the value specified in body though (which is doc.rev, not oldDoc.rev). Does something break if you omit setting this property, if you don't think it's used?

Copy link
Collaborator

@adamcfraser adamcfraser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything looks good, except for one question about whether setting both doc and oldDoc is intentional when only docID is specified.

assert.Equal(t, respMap.Access, channels.AccessMap{"user1": channels.BaseSetOf(t, "channel2")})

// get doc from bucket with no body provided
response = rt.SendDiagnosticRequest("GET", fmt.Sprintf("/{{.keyspace}}/_sync?doc_id=doc"), ``)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This case is using the document from the bucket as both doc and oldDoc in the sync function. Is this intentional? Using it as doc only seems like a more natural use case to me, but I'm open to other points of view.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember we discussed allowing this in sync up, either we use it as doc only or we attempt to get the previous rev as oldDoc?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ive made the change to use it as doc only, let me know if we need to pull the old rev as old doc

@adamcfraser adamcfraser merged commit dd9430c into main Apr 9, 2024
33 checks passed
@adamcfraser adamcfraser deleted the CBG-3721 branch April 9, 2024 18:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants