Bulk paginated api for backfilling #29

ensary · 2019-09-09T17:18:46Z

Asking for early feedback.
To be updated with tests and possibly changes to queries once I look into performance.

codecov · 2019-09-09T17:22:07Z

Codecov Report

Merging #29 into master will decrease coverage by 1.87%.
The diff coverage is 85%.

@@            Coverage Diff             @@
##           master      #29      +/-   ##
==========================================
- Coverage   88.39%   86.52%   -1.88%     
==========================================
  Files           9        9              
  Lines         431      579     +148     
==========================================
+ Hits          381      501     +120     
- Misses         30       51      +21     
- Partials       20       27       +7

| Impacted Files | Coverage Δ | |
|---|---|---|
| pkg/domain/storage.go | `0% <ø> (ø)` | ⬆️ |
| pkg/handlers/v1/cloud_fetch.go | `89.43% <81.7%> (-10.57%)` | ⬇️ |
| pkg/storage/db.go | `84.97% <92.1%> (-0.46%)` | ⬇️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update b408c3c...1507f33.

zlozano · 2019-09-09T21:52:47Z

pkg/storage/db.go

@@ -60,6 +60,35 @@ const latestStatusQuery = "WITH latest_candidates AS ( " +
 	"    aws_resources ON " +
 	"        latest.aws_resources_id = aws_resources.id;"

+// This query is used to retrieve all the 'active' resources (i.e. those with assigned IP/Hostname) for specific date
+// TODO - run performance analysis, possibly do something else on SQL level (optimize query or adjust schema)


zlozano · 2019-09-09T21:53:45Z

LGTM so far

gcase555 · 2019-09-10T20:10:15Z

lgtm so far as well

bkrebsbach · 2019-09-11T13:36:07Z

api.yaml

+            }
+          success: '{"status": 200, "bodyPassthrough": true}'
+          error: >
+            {


Suggestion: After playing around with this syntax a bunch, I personally find something like this to be a little easier to read, since it separates the JSON response from the template syntax. There's more duplication here though, so I have no strong feelings either way.

#! if eq .Response.Body.errorType "NotFound" !#
{ "status": 404 }
#! else if eq .Response.Body.errorType "DependencyFailure" !#
{ "status": 502, "body": { "code": 502, "status": "Bad Gateway", "reason": "#!.Response.Body.errorMessage!#" }}
#! else !#
{ "status": 500, "body": { "code": 500, "status": "Internal Server Error", "reason": "#!.Response.Body.errorMessage!#" }}
#! end !#

zlozano · 2019-09-16T19:00:55Z

Approving this in the WIP state. Still would like integration tests.

zlozano · 2019-09-19T13:42:05Z

api.yaml

@@ -138,8 +138,8 @@ paths:
          request: >
            {
              "time": "#!index .Request.Query.time 0!#",
-              "count": #!if .Request.Query.count !# #!index .Request.Query.count 0!# #! else !# 100 #! end !# #!if .Request.Query.type !# ,
-              "type": "#!index .Request.Query.type 0!#" #! end !#
+              "count": #!if .Request.Query.count !# #!index .Request.Query.count 0!# #! else !# 100 #! end !# ,


This is not really a suggestion, more of a question: would this default value be better embedded as a value inside the app? If it lives in the app, anyone who uses this service gets it immediately. If it stays here in the API descriptor, everyone who uses the service must duplicate it.

Do we have a good example? It looks like a good idea to me.

Not that I know of. One approach would be to have an attribute inside your handler for a default page size which is populated from an environment variable.
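
A minimal sketch of the approach described above, assuming a hypothetical `DEFAULT_PAGE_SIZE` environment variable and a hypothetical `FetchHandler` type (neither name comes from this PR). The fallback of 100 mirrors the default currently set in api.yaml.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// Hypothetical names for this sketch; the real service may use others.
const (
	defaultPageSizeEnv = "DEFAULT_PAGE_SIZE"
	fallbackPageSize   = 100 // same default the api.yaml template uses today
)

// pageSizeFromEnv reads the default page size through the supplied lookup
// function and falls back to fallbackPageSize when the value is unset,
// unparsable, or zero.
func pageSizeFromEnv(getenv func(string) string) uint {
	n, err := strconv.ParseUint(getenv(defaultPageSizeEnv), 10, 32)
	if err != nil || n == 0 {
		return fallbackPageSize
	}
	return uint(n)
}

// FetchHandler is a hypothetical handler that carries the default page size
// as an attribute, populated once at startup.
type FetchHandler struct {
	DefaultPageSize uint
}

// effectiveCount returns the caller's requested count, or the handler's
// default when the caller did not supply one (zero value).
func (h *FetchHandler) effectiveCount(requested uint) uint {
	if requested == 0 {
		return h.DefaultPageSize
	}
	return requested
}

func main() {
	h := &FetchHandler{DefaultPageSize: pageSizeFromEnv(os.Getenv)}
	fmt.Println(h.effectiveCount(0), h.effectiveCount(25))
}
```

With this shape the API descriptor could pass `count` through only when the caller supplies it, and every consumer of the service would share the same default.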

zlozano · 2019-09-19T13:48:29Z

pkg/handlers/v1/cloud_fetch.go

-	Count  *uint `json:"count"`
-	Offset *uint `json:"offset"`
+	Count     uint   `json:"count"`
+	Offset    uint   `json:"offset"`


~~task: AFAICT this isn't used anymore~~ NVM. I see that it's returned by the conversion functions.

zlozano · 2019-09-19T13:56:46Z

pkg/storage/db.go

@@ -546,48 +545,49 @@ func (db *DB) runQuery(ctx context.Context, query string, args ...interface{}) (
 		var timestamp time.Time

 		err = rows.Scan(&row.ARN, &ipAddress, &hostname, &isPublic, &isJoin, &timestamp, &row.AccountID, &row.Region, &row.ResourceType, &metaBytes)
+		if err != nil {


gcase555

⭐️

kconwayatlassian · 2019-09-19T17:56:36Z

pkg/storage/db.go

+			}
+		}
+		found = false
+		var ipAddresses *[]string


suggestion: I can see that this was already being handled as a pointer to a slice. However, I don't see any reason why this is being handled this way. All usage of the variable after this dereferences the pointer value. I believe this should just be a slice type and not a pointer to a slice type. In general, it's odd to ever see a pointer to a slice.

It's outside this PR (the change is indentation) but I will look into fixing this in the next one.
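
To illustrate the suggestion, here is a small sketch comparing the two shapes. The function names and the empty-string filter are made up for the example and are not taken from db.go.

```go
package main

import "fmt"

// collectViaPointer mirrors the pattern questioned in the review: a pointer
// to a slice, which every use site then has to dereference.
func collectViaPointer(rows []string) *[]string {
	var ipAddresses *[]string
	for _, ip := range rows {
		if ip == "" {
			continue
		}
		if ipAddresses == nil {
			ipAddresses = &[]string{}
		}
		*ipAddresses = append(*ipAddresses, ip)
	}
	return ipAddresses
}

// collectViaSlice does the same work with a plain slice. A nil slice is
// already a usable "empty" value: append, len, and range all accept it.
func collectViaSlice(rows []string) []string {
	var ipAddresses []string
	for _, ip := range rows {
		if ip == "" {
			continue
		}
		ipAddresses = append(ipAddresses, ip)
	}
	return ipAddresses
}

func main() {
	rows := []string{"10.0.0.1", "", "10.0.0.2"}
	fmt.Println(*collectViaPointer(rows))
	fmt.Println(collectViaSlice(rows))
}
```

A slice value is already a small header that refers to its backing array, so the extra pointer mostly adds nil checks and dereferences.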

kconwayatlassian · 2019-09-19T17:59:33Z

pkg/handlers/v1/cloud_fetch.go

+	if err != nil {
+		return "", err
+	}
+	token := base32.StdEncoding.WithPadding(base32.NoPadding).EncodeToString(js)


question: Why base32 instead of base64?

Intentional. Base32 is case insensitive, which makes use in path (or part of URL) safer. Not exactly a big deal in our case, but the thinking was to make this as portable as possible.

kconwayatlassian · 2019-09-19T18:09:35Z

pkg/handlers/v1/cloud_fetch.go

+	Type      string `json:"type"`
+}
+
+func (p *CloudAssetFetchAllByTimestampParameters) toNextPageToken() (string, error) {


nit: Personally, I prefer not having behaviors/methods attached to data containers. My recommendation is generally that 1) interfaces define behaviors but do not contain data, 2) structs contain data but do not implement behaviors, and 3) a struct may implement an interface but then it must not contain data. The phrasing of "contains data" here means that the struct's attributes are exported and we expect people to use them directly. The goal of that philosophy is to ensure that we don't fall into traditional class patterns where we're trying to model state and ways of manipulating that state. It's not perfect and there are no "hard lines" in practice, but I find it useful.

For example, instead of func (p *CloudAssetFetchAllByTimestampParameters) toNextPageToken() I'd suggest func NextPageToken(p CloudAssetFetchAllByTimestampParameters). Practically speaking, these are both the same. The receiver pointer notation like func (*Something) is intended to replace old C-style struct methods that eventually became class/instance methods. This is really more of a philosophical change where receiver functions operate on the state of the instance they are attached to and the non-receiver style is a pure function that operates on a given input.

Again, they aren't that different in most cases. I just find it more comfortable to avoid any accidental state that might show up in class-style methods.

Excellent explanation, esp the "receiver functions operate on the state of the instance they are attached to and the non-receiver style is a pure function that operates on a given input" part.
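
A small sketch of the two styles side by side. `FetchParameters` and the offset-advancing logic are assumptions made for the example; the real `toNextPageToken` also encodes a token, which is omitted here.

```go
package main

import "fmt"

// FetchParameters is a hypothetical data container with exported fields.
type FetchParameters struct {
	Count  uint
	Offset uint
	Type   string
}

// Receiver style: the behavior is attached to the data container and reads
// the state of the instance it is called on.
func (p *FetchParameters) nextPage() FetchParameters {
	return FetchParameters{Count: p.Count, Offset: p.Offset + p.Count, Type: p.Type}
}

// Pure-function style: the input is passed by value, so the function works
// on its own copy and cannot touch the caller's struct.
func NextPage(p FetchParameters) FetchParameters {
	p.Offset += p.Count
	return p
}

func main() {
	current := FetchParameters{Count: 100, Offset: 200, Type: "AWS::EC2::Instance"}
	fmt.Println(current.nextPage())
	fmt.Println(NextPage(current))
	fmt.Println(current) // unchanged by either call
}
```

Both calls produce the same result, which matches the point that the difference is mostly one of convention rather than behavior.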

kconwayatlassian · 2019-09-19T18:10:47Z

pkg/handlers/v1/cloud_fetch.go

-	return extractOutput(assets), nil
+	nextPageToken, e := input.toNextPageToken()
+	if e != nil {
+		logger.Error(logs.StorageError{Reason: e.Error()})


question: Why do we continue here instead of returning the storage error?

The current page of results is valid, and the case where we cannot generate the token from valid parameters is unlikely (if possible at all), but I did not want to ignore an error condition that we did not foresee. So the current logic is: return the valid results (that's what the client requested), log any issue with generating the token for the next page, and let the next request fail.
One clear action item: I should not use StorageError for this, as it has nothing to do with storage. I am also open to failing loudly if we can't generate the token for the next page, even when we have valid results for the current one.

kconwayatlassian · 2019-09-19T18:12:21Z

pkg/handlers/v1/cloud_fetch.go

 	}
 	if len(assets) == 0 {
-		return CloudAssets{}, NotFound{ID: "any"}
+		return PagedCloudAssets{}, NotFound{ID: "any"}


suggestion: This was not introduced in this PR, but we would usually keep custom error types in the domain package since they are used to communicate across interface boundaries.

What if an error is used in the handler layer only? Should we still put it in the domain package?

kconwayatlassian · 2019-09-19T18:13:57Z

pkg/handlers/v1/cloud_fetch.go

+	}
+
+	//generic error to report to caller to avoid exposing the internal token structure NB, the specific error is still logged
+	tokenError := errors.New("malformed pageToken")


nit: This could be a package constant/variable and re-used rather than redefined on each call.

Great point

first pass on bulk paginated api for backfilling

0832827

ensary requested review from zlozano, kconwayatlassian, gcase555 and sydneyteh96 September 9, 2019 17:18

ensary requested a review from a team as a code owner September 9, 2019 17:18

typos and linter issues

dd1845c

gcase555 reviewed Sep 9, 2019

main.go Outdated

gcase555 reviewed Sep 9, 2019

pkg/handlers/v1/cloud_fetch.go Outdated

ensary added 2 commits September 9, 2019 15:50

change naming

80d11e4

still can not get goimports to run properly from IDE (coneofshame)

219dc42

zlozano reviewed Sep 9, 2019

pkg/handlers/v1/cloud_fetch.go Outdated

zlozano reviewed Sep 9, 2019

bkrebsbach reviewed Sep 11, 2019

kconwayatlassian reviewed Sep 11, 2019

api.yaml Outdated

pkg/handlers/v1/cloud_fetch.go Outdated

pkg/handlers/v1/cloud_fetch.go Outdated

gen.go and tests for handler

21d338e

zlozano reviewed Sep 12, 2019

api.yaml

add page token and filtering resources by CF type

deccf20

kconwayatlassian previously approved these changes Sep 12, 2019

api.yaml Outdated

api.yaml Outdated

api.yaml Outdated

remove enum value as it is not extensible, fix default integer value

c826c0d

ensary dismissed kconwayatlassian’s stale review via c826c0d September 12, 2019 21:50

change the page api so that it does not clash with asset fetching by id

0a9641b

kconwayatlassian previously approved these changes Sep 16, 2019

zlozano previously approved these changes Sep 16, 2019

Add parameter handling for resource type, fix SQL issue, add paging code

3ab2aec

ensary dismissed zlozano’s stale review via 3ab2aec September 18, 2019 18:32

ensary dismissed kconwayatlassian’s stale review via 3ab2aec September 18, 2019 18:32

ensary added 3 commits September 18, 2019 13:42

linter/style

e25e779

more tests

e4e03a5

linter

f1ce3d6

ensary requested review from gcase555, zlozano, kconwayatlassian and dscgarcia September 18, 2019 21:57

zlozano reviewed Sep 19, 2019

pkg/handlers/v1/cloud_fetch.go Outdated

more tests

87bbb8a

ensary changed the title ~~WIP: first pass on bulk paginated api for backfilling~~ Bulk paginated api for backfilling Sep 19, 2019

zlozano reviewed Sep 19, 2019

ensary added 2 commits September 19, 2019 09:02

stop exposing internal structure for pageToken to caller, remove debugging

d667a34

formatting

1507f33

ensary requested a review from zlozano September 19, 2019 14:34

zlozano approved these changes Sep 19, 2019

gcase555 approved these changes Sep 19, 2019

ensary merged commit 11d8fd0 into master Sep 19, 2019

kconwayatlassian reviewed Sep 19, 2019

ensary deleted the add-bulk-fetch branch September 9, 2020 18:56
