refactor: Simplify fetcher interface #1746

AndrewSisley · 2023-08-02T11:36:36Z

Relevant issue(s)

Resolves #1589

Description

Simplifies the fetcher interface by removing 2 of the 3 FetchFoo functions. As well as making it simpler, this means there is only one function left, so behaviour (e.g. deleted doc support) can no longer differ between them.

There is a fair amount going on here, so I suggest reviewing commit by commit; it should help explain what is going on and why.
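As a rough illustration of the shape this PR moves towards, the sketch below shows a fetcher that exposes a single FetchNext function, with decoding layered on top by the caller rather than provided by extra interface methods. The types and names (`encodedDoc`, `fetcher`, `sliceFetcher`, `decodeNext`) are simplified, hypothetical stand-ins and are not the actual DefraDB definitions.

```go
package main

import "errors"

// encodedDoc is a hypothetical stand-in for the fetcher's encoded document.
type encodedDoc struct {
	key     string
	deleted bool
	fields  map[string][]byte
}

// fetcher exposes one function only, so there is a single behaviour to
// maintain (for example, how deleted documents are handled).
type fetcher interface {
	// FetchNext returns the next document, or nil once exhausted.
	FetchNext() (*encodedDoc, error)
}

// sliceFetcher is a toy in-memory implementation used for illustration.
type sliceFetcher struct {
	docs        []*encodedDoc
	pos         int
	showDeleted bool
}

func (f *sliceFetcher) FetchNext() (*encodedDoc, error) {
	for f.pos < len(f.docs) {
		doc := f.docs[f.pos]
		f.pos++
		if doc == nil {
			return nil, errors.New("nil document in store")
		}
		// Deleted-doc handling lives in exactly one place.
		if doc.deleted && !f.showDeleted {
			continue
		}
		return doc, nil
	}
	return nil, nil
}

// decodeNext shows how a decoded view can be built on top of FetchNext by
// the caller instead of being a second interface method.
func decodeNext(f fetcher) (map[string]string, error) {
	doc, err := f.FetchNext()
	if err != nil || doc == nil {
		return nil, err
	}
	out := make(map[string]string, len(doc.fields))
	for name, raw := range doc.fields {
		out[name] = string(raw)
	}
	return out, nil
}
```

Because `decodeNext` only depends on `FetchNext`, any rule added to `FetchNext` (such as skipping deleted documents) applies to every caller automatically.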

codecov · 2023-08-02T11:47:08Z

Codecov Report

Patch coverage: 74.48% and project coverage change: +0.25% 🎉

Comparison is base (0647f97) 75.42% compared to head (8c0b6eb) 75.68%.

@@             Coverage Diff             @@
##           develop    #1746      +/-   ##
===========================================
+ Coverage    75.42%   75.68%   +0.25%     
===========================================
  Files          208      208              
  Lines        21801    21735      -66     
===========================================
+ Hits         16443    16448       +5     
+ Misses        4210     4142      -68     
+ Partials      1148     1145       -3

Flag	Coverage Δ
all-tests	`75.68% <74.48%> (+0.25%)`	⬆️

Files Changed	Coverage Δ
db/collection_get.go	`72.73% <25.00%> (-8.52%)`	⬇️
db/collection_index.go	`96.64% <50.00%> (-0.82%)`	⬇️
planner/scan.go	`88.50% <50.00%> (-1.40%)`	⬇️
lens/fetcher.go	`69.65% <68.18%> (+19.28%)`	⬆️
db/fetcher/encoded_doc.go	`75.68% <82.50%> (+10.05%)`	⬆️
db/fetcher/fetcher.go	`76.31% <90.24%> (+0.95%)`	⬆️

... and 4 files with indirect coverage changes

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

fredcarle

LGTM. Just one minor todo.

I like the change and the simplification it brings.

fredcarle · 2023-08-02T15:00:32Z

db/fetcher/encoded_doc.go

+	schemaVersionID      string
+	status               client.DocumentStatus
+	properties           map[client.FieldDescription]*encProperty
+	decodedpropertyCache map[client.FieldDescription]any


todo: change to decodePropertyCache

Why? decode is a verb, and if there are values in here, this means it has been decoded (past-tense) (and I think the past-tense becomes an adjective, which the present tense cannot be)

Sorry, I made a typo in my change request: it should be decodedPropertyCache. I just want it to be camel-cased properly.

Ah 😆 I didn't spot that at all - thanks Fred, will change of course

Fix casing

jsimnz

Some questions and suggestions and one todo.

One notable question regarding the interface change was if we still want it to match the Enumerable interface?

Will run thru this again (went thru it all, but want to digest, and might have some other thoughts depending on the answer/approach to the above question)

But overall, very good!

jsimnz · 2023-08-03T00:54:10Z

db/fetcher/encoded_doc.go

+		// only get filter fields if filter=true
+		if onlyFilterProps && !prop.IsFilter {


todo: (This might be a problem from the original implementation.) I think the onlyFilterProps check should precede the cached value check/assignment?

I can see what you mean, but I thought this check was purely a performance thing from within this function - returning extra stuff would not be an issue, and if it is cached then the performance cost should be negligible?

Although, having written that out it still seems technically incorrect to return the value, even if there is no downside RE user visible behaviour (incl. perf.), I'll move it and see if anything breaks 😁

Move
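A minimal sketch of the ordering being asked for, assuming a simplified document type: the filter check runs before the cache lookup, so a cached non-filter property is never returned when only filter properties were requested. `encodedDoc`, `encProperty`, `newEncodedDoc`, and `decodeProps` here are hypothetical stand-ins, not the code in db/fetcher/encoded_doc.go.

```go
package main

// encProperty is a simplified, hypothetical stand-in for the PR's encProperty.
type encProperty struct {
	raw      []byte
	isFilter bool
}

// encodedDoc holds encoded properties plus a cache of decoded values.
type encodedDoc struct {
	properties map[string]*encProperty
	cache      map[string]any
	decodes    int // counts decode calls, for illustration only
}

func newEncodedDoc(props map[string]*encProperty) *encodedDoc {
	return &encodedDoc{properties: props, cache: make(map[string]any)}
}

// decodeProps returns decoded values, restricted to filter properties when
// onlyFilterProps is true.
func (d *encodedDoc) decodeProps(onlyFilterProps bool) map[string]any {
	out := make(map[string]any)
	for name, prop := range d.properties {
		// Filter check first: skip before touching the cache.
		if onlyFilterProps && !prop.isFilter {
			continue
		}
		// Cache lookup second: reuse a previously decoded value.
		if v, ok := d.cache[name]; ok {
			out[name] = v
			continue
		}
		// "Decoding" is just a string conversion in this sketch.
		d.decodes++
		v := string(prop.raw)
		d.cache[name] = v
		out[name] = v
	}
	return out
}
```

With the checks in the opposite order, a second call with `onlyFilterProps` set would also return any non-filter values cached by an earlier full decode.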

jsimnz · 2023-08-03T00:57:14Z

db/fetcher/fetcher.go

@@ -514,14 +522,49 @@ func (df *DocumentFetcher) processKV(kv *keyValue) error {

 	df.execInfo.FieldsFetched++

-	df.doc.Properties[fieldDesc] = property
+	df.doc.properties[fieldDesc] = property

 	return nil
 }

 // FetchNext returns a raw binary encoded document. It iterates over all the relevant
 // keypairs from the underlying store and constructs the document.
 func (df *DocumentFetcher) FetchNext(ctx context.Context) (EncodedDocument, ExecInfo, error) {


question: Wasn't part of this refactor to make it also match the Enumerable interface? If so, I think it would be better to do this under a single PR rather than split, since it's all very related.

If this is the case, we can likely move the ExecInfo off the main Fetch method (it feels a little weird here anyway) and have a dedicated method to get the info instead.

> Wasn't part of this refactor to make it also match the Enumerable interface?

We for sure spoke a bit about that, and I started to look into it, but it felt slightly forced (due to a slight mismatch of funcs like Close and Reset, Init, Start, etc.) and had much less value IMO. This PR currently makes things much better, and cheaper for myself and Islam to build upon; changing to Enumerable would not speed up the Lens work, and I doubt it would help Islam.

We can look at making it Enumerable later I think, for now it is effort I would rather not spend, and would likely have very little impact on our productivity in the medium term.

> we can likely move the ExecInfo off the main Fetch method (it feels a little weird here anyway) and have a dedicated method to get the info instead.

I also thought about doing that too, but it is not a new problem, and is pretty common to have stats like this outputted from funcs like this as return values. It is also extra effort now that does not benefit us anytime soon as far as I can see, and would only slow down the merge of changes that do benefit us right now.

> We for sure spoke a bit about that, and I started to look into it, but it felt slightly forced (due to a slight mismatch of funcs like Close and Reset, Init, Start, etc)

OK no problem. Just remembered talking about it and wanted to make sure it was addressed (regardless of in favor or against).

> I also thought about doing that too, but it is not a new problem, and is pretty common to have stats like this outputted from funcs like this

Correct, this isn't a new problem, and not really all that important. The comment was only in relation to if we moved forward with the enumerable interface matching.

> Just remembered talking about it and wanted to make sure it was addressed

Yeah, I should probably have noted this in the PR description - I nearly did and then changed my mind as it wasn't noted in the related issue.

> The comment was only in relation to if we moved forward with the enumerable interface matching.

Does sound like the cleanest way to handle it with an enumerable-derived interface :)
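For reference, a hedged sketch of what that could look like if it is picked up later: iteration goes through Next/Value, and execution stats come from a dedicated accessor instead of a FetchNext return value. The `enumerable` interface below is an assumed shape (the real Enumerable interface may differ), and `execInfo`, `doc`, and `docEnumerator` are hypothetical.

```go
package main

// execInfo is a hypothetical stand-in for the fetcher's execution stats.
type execInfo struct {
	DocsFetched   uint64
	FieldsFetched uint64
}

// enumerable is an assumed Next/Value iteration shape.
type enumerable[T any] interface {
	Next() (bool, error)
	Value() T
}

// doc is a simplified document used for illustration.
type doc struct {
	fields map[string]string
}

// docEnumerator iterates over in-memory docs and records stats as it goes.
type docEnumerator struct {
	docs []doc
	pos  int
	info execInfo
}

// Next advances to the next document and updates the stats.
func (e *docEnumerator) Next() (bool, error) {
	if e.pos >= len(e.docs) {
		return false, nil
	}
	e.info.DocsFetched++
	e.info.FieldsFetched += uint64(len(e.docs[e.pos].fields))
	e.pos++
	return true, nil
}

// Value returns the current document. It must only be called after Next
// has returned true.
func (e *docEnumerator) Value() doc {
	return e.docs[e.pos-1]
}

// ExecInfo is the dedicated accessor; stats are no longer part of the
// per-document return values.
func (e *docEnumerator) ExecInfo() execInfo {
	return e.info
}

// Compile-time check that docEnumerator satisfies the assumed interface.
var _ enumerable[doc] = (*docEnumerator)(nil)
```

Callers that do not care about stats then only see Next and Value, and those that do can read them once after the loop.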

jsimnz · 2023-08-03T01:01:48Z

db/fetcher_test.go

@@ -183,162 +182,3 @@ func TestFetcherGetAllPrimaryIndexEncodedDocMultiple(t *testing.T) {
 	assert.NoError(t, err)
 	assert.NotNil(t, encdoc)
 }
-


just want to raise (my usual ;) ) objection to removing some of these unit tests.

The function no longer exists, there is nothing to unit test. FetchNext is covered above.

There are two notable things in these removed tests that aren't covered in the tests that remain: 1) decoding and 2) value checks. Both of these have value and can either be integrated into the remaining tests, or we can keep the deleted ones and just update them to use the new APIs.

> Decoding

If we want unit tests for decoding I think they should look quite different to these, and be far more comprehensive - the ones deleted only ever test a single string value, and one low positive integer. I don't think any time would really be saved by converting these tests to decoding unit tests.

> Value checks

That's weird, but you are right, the other tests never actually check the result of FetchNext, only that it doesn't error 🤷 - I still very much don't think they are worth keeping though, given that they are offering very little user protection vs our integration tests

jsimnz

Approving now. All looks good. Would still in theory like to see some aspect of the tests preserved, but I'll leave that to your discretion.

AndrewSisley added the area/query, refactor, and action/no-benchmark labels Aug 2, 2023

AndrewSisley added this to the DefraDB v0.7 milestone Aug 2, 2023

AndrewSisley requested a review from a team August 2, 2023 11:36

AndrewSisley self-assigned this Aug 2, 2023

AndrewSisley force-pushed the 1589-fetcher-interface branch from c1679c0 to 284b1b5 Compare August 2, 2023 11:37

fredcarle approved these changes Aug 2, 2023

jsimnz requested changes Aug 3, 2023

AndrewSisley requested a review from jsimnz August 3, 2023 11:10

AndrewSisley force-pushed the 1589-fetcher-interface branch from 284b1b5 to 407c867 Compare August 3, 2023 11:12

AndrewSisley added 12 commits August 3, 2023 09:30

Host doc status on encoded doc

d853a7a

And handle within the fetcher instance that fetches it. This means the status is now available via fetcher.FetchNext, an important step towards making the 3 fetcher functions equal.

Move deleted doc fetch logic into fetcher.FetchNext

b1551d7

Grants deleted doc support to all three FetchFoo funcs as the two others call FetchNext. All 3 funcs should now behave the same way, just outputting the result in a slightly different format.

Make encodedDoc.properties private

ba18326

As well as being inconsistent, it is blocking the addition of a Properties() func.

Add encodedDoc.Properties getter

6606475

This allows a minor refactor, and provides access to the properties of the EncodedDoc from outside of the interfaced object.

Remove DecodeToDoc from EncodedDoc interface

f0efe79

Simplifies both the EncodedDoc interface, and the internals of the implementation.

Remove Decode from EncodedDoc interface

32b2aba

Refactor lens fetcher

5cb9aec

FetchNextDoc and FetchNextDecoded now can just call FetchNext.

Remove unused scan node key prop

d0a82db

Remove fetcher.FetchNextDoc

1195e4a

Remove fetcher.FetchNextDecoded

9a8fed0

PR FIXUP - Move filter-properties check

8807f85

PR FIXUP - Fix decodedPropertyCache typo

8c0b6eb

AndrewSisley force-pushed the 1589-fetcher-interface branch from b2f9ab0 to 8c0b6eb Compare August 3, 2023 13:30

jsimnz approved these changes Aug 3, 2023

AndrewSisley merged commit 07380d8 into sourcenetwork:develop Aug 3, 2023

AndrewSisley deleted the 1589-fetcher-interface branch August 3, 2023 17:53
