Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(batchUpdate): enhance batch update functionality #1483

Merged
merged 13 commits into from
Aug 17, 2024

Conversation

kirangodishala
Copy link
Contributor

@kirangodishala kirangodishala commented Jul 19, 2024

This PR adds the following functionality to pipeline batch update operation :

  • refactor bulk storage operation to make it atomic
  • add duplicate pipeline check to /pipelines POST handler method (i.e. save individual pipeline) and  /pipelines/{id} PUT handler method (i.e. update individual pipeline)
  • update /pipelines/batchUpdate POST handler method to address deserialization issues and add some useful log statements
  • adds configurable controls to decide whether cache should be refreshed while checking for duplicate pipelines
controller:
   pipeline:
      save:
         refreshCacheOnDuplicatesCheck: false // default is true 
  • Add pipeline validations to validate all the deserialized pipelines
  • batchUpdate now responds with a status of succeeded and failed pipelines info. The response will be a map containing information in the following format:
[
"successful_pipelines_count"  : <int>,
"successful_pipelines"        : <List<String>>,
"failed_pipelines_count"      : <int>,
"failed_pipelines"            : <List<Map<String, Object>>> 
]
  • add staleCheck to batchUpdate
  • adjust permissions to batchUpdate
    • before: isAdmin,
    • now: verifies application write permission

@kirangodishala kirangodishala marked this pull request as draft July 19, 2024 13:28
@kirangodishala kirangodishala force-pushed the batchUpdate-clean-commits branch from 61c263d to 1c6501b Compare July 19, 2024 17:41
@kirangodishala kirangodishala marked this pull request as ready for review July 19, 2024 17:54
@dbyron-sf
Copy link
Contributor

dbyron-sf commented Jul 19, 2024

For the record, #988 was a previous attempt at something similar. See spinnaker/governance#214 for background.

@kirangodishala kirangodishala force-pushed the batchUpdate-clean-commits branch 3 times, most recently from 7bbdabd to cb00635 Compare July 24, 2024 20:22
dbyron-sf and others added 11 commits July 25, 2024 14:24
to match the one that Front50CoreConfiguration provides.  This paves the way to test
additional PipelineController functionality.
…urations to be used for save/update controller mappings

* Add new configuration class PipelineControllerConfig
* Update Front50WebConfig to use PipelineControllerConfig
* Update PipelineController to use PipelineControllerConfig
* Update PipelineControllerSpec to use PipelineControllerConfig
* Update PipelineControllerTck to use PipelineControllerConfig
* add test to check duplicate pipelines when refreshCacheOnDuplicatesCheck flag is enabled and disabled
* refactor SqlStorageService.storeObjects() method to make the bulk save an atomic operation
* without this change, in case of db exception, some chunks of pipelines get saved while the others fail leading to inconsistency.
* Last Catch block is now removed as it's no longer partial storage of supplied pipelines
* add test for bulk create pipelines which tests the atomic behaviour of the SqlStorageService.storeObjects() method
…or batchUpdate().

checkForDuplicatePipeline() is removed from validatePipeline() and cron trigger validations are moved into validatePipeline() so that reusable code stays at on e place.

remove unused overloaded checkForDuplicatePipeline() method

Fix NPE caused in a test(should create pipelines in a thread safe way) in PipelineControllerSpec due to a newly added log message in PipelineController.save()
…to address deserialization issues and add some useful log statements
…troller.batchUpdate

* Check if user has WRITE permissions on the pipeline, if not the pipeline will be added to invalid pipelines list
* This change is a first step towards controlling access at pipeline level in a batch update. batchUpdate is still allowed only for admins but in the next few commits, the access level will be equated to that of individual pipeline save.
* Check if duplicate pipeline exists in the same app
* Validate pipeline id
* Adjust test classes for PipelineController changes
…failed pipelines and their counts

* The response will be in the following format:
[
"successful_pipelines_count" : <int>,
"successful_pipelines"        : <List<String>>,
"failed_pipelines_count"      : <int>,
"failed_pipelines"            : <List<Map<String, Object>>>
]
…ine in the batch already exists and their lastModified timestamps don't match then the pipeline is stale and hence added to invalid pipelines list. This behaviour is same as that of individual save and update operations.

* add test to validate the code around staleCheck for batchUpdate
* adjust permissions to batchUpdate (before: isAdmin, now: verifies application write permission).

* enforce runAsUser permissions while deserializing pipelines

* This puts batchUpdate on a par with individual save w.r.t. access restrictions

* adjust test classes according to the changes to the PipelineController
Fixed test exceptions by making the following changes:
- added @EqualsAndHashCode to Pipeline
- added `pipelineDAO.all(true)` in SqlPipelineControllerTck.createPipelineDAO() to initialize the cache with empty set. Otherwise, the tests fail due to NPE.
@kirangodishala kirangodishala force-pushed the batchUpdate-clean-commits branch from cb00635 to 894299a Compare July 25, 2024 14:17
pipelinesToSave.size(),
System.currentTimeMillis() - bulkImportStartTime);

List<String> savedPipelines =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in the failed pipelines, you're storing more info about each pipeline, eg application, id, etc. does it make sense to add some of that information to the saved pipelines as well?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see where we store error information in failed pipelines, but not other stuff. Can you provide a pointer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The whole failed pipeline(as a map) gets added to the list of failed pipelines in response. Here and here is the code for that. AdditionallyerrorMsg is put into the failed pipeline map.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@link108 - Given hundreds/thousands of pipelines, the idea I think is to let the user know more about failed pipelines and keep the info about succeeded ones to minimum. But if adding a couple of fields (like id, application) provides more value, we can do that.

}
}
}
}
} catch (e: Exception) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I doubt this happens much - but any reason to lose this try/catch here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking TransientDaoException case where DB is failing or similar...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SHORT answer it'd be nice if these were caught & re-raised as Spinnaker exception objects

try {
withPool(poolName) {
jooq.transactional(sqlRetryProperties.transactions) { ctx ->
withPool(poolName) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this flip, I think the entire batch fails now vs. only a chunk of it - intentional?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Guess that's partly the point of this PR - just wondering if that's a good thing?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can ignore this mostly just curious philosophically which is better :)


return pipelineDAO.create(pipeline.getId(), pipeline);
Pipeline savedPipeline = pipelineDAO.create(pipeline.getId(), pipeline);
log.info(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same nit: probably should be debug level vs. info level.

log.debug(
"Successfully validated pipeline {} in {}ms",
pipeline.getName(),
System.currentTimeMillis() - validationStartTime);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe validation time would be better as a metric? A gauge OR counter with tags by app, or pipeline stages or such? NOT required, just a thought.

List<Pipeline> pipelines = deserializedPipelines.getLeft();
List<Map<String, Object>> failedPipelines = deserializedPipelines.getRight();

log.info(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution on "info" logs. IF this isn't used often ok - not sure that batchUpdates like this are regularly done but would prefer debug level.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This log occurs once for the entire batch operation so I am hoping it comes handy while monitoring.

if (staleCheck
&& !Strings.isNullOrEmpty(pipeline.getId())
&& pipeline.getLastModified() != null) {
checkForStalePipeline(pipeline, errors);
}

// Run other pre-configured validators
pipelineValidators.forEach(it -> it.validate(pipeline, errors));

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly a LOT of the above operations LIKE the cron ids could be moved to pipelineValidators & shrink a lot of this code down :) and simplify it! Future enhancement maybe...

@kirangodishala
Copy link
Contributor Author

kirangodishala commented Aug 5, 2024

@link108 @jasonmcintosh - could you approve the PR?

@dbyron-sf dbyron-sf added the ready to merge Approved and ready for merge label Aug 17, 2024
@mergify mergify bot added the auto merged label Aug 17, 2024
@mergify mergify bot merged commit 5f7a4d7 into spinnaker:master Aug 17, 2024
5 checks passed
richard-timpson pushed a commit to richard-timpson/spinnaker.io that referenced this pull request Aug 23, 2024
…e functionality

The documentation for front50 already existed. Adds some sections for the gate and orca changes which make the functionality feature complete.

See relevant PRs
spinnaker/front50#1483, spinnaker/orca#4773, and spinnaker/gate#1823
dbyron-sf pushed a commit to spinnaker/spinnaker.io that referenced this pull request Aug 23, 2024
…e functionality (#458)

The documentation for front50 already existed. Adds some sections for the gate and orca changes which make the functionality feature complete.

See relevant PRs
spinnaker/front50#1483, spinnaker/orca#4773, and spinnaker/gate#1823

Co-authored-by: Richard Timpson <richard.timpson@salesforce.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants