-
Notifications
You must be signed in to change notification settings - Fork 111
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[DATA-611] Stop data capturing when turned off #1510
[DATA-611] Stop data capturing when turned off #1510
Conversation
@@ -502,6 +526,7 @@ func (svc *builtIn) Update(ctx context.Context, cfg *config.Config) error { | |||
if err := svc.initOrUpdateSyncer(ctx, 0, cfg); err != nil { | |||
return err | |||
} | |||
// so if toggled sync off, then I want to STOP existing collectors |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not necessarily - these should be independent. You might still want to capture data locally, just stop syncing it. Collectors are only in charge of collecting the data
// if disabled, make sure that it is closed, so it doesn't keep collecting data. | ||
collector, md, err := svc.getCollectorFromConfig(attributes) | ||
if err != nil { | ||
svc.logger.Errorw("collector ", attributes.Name, " was not found", "error", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this necessarily an error? Or would it occur with any component that hasn't and doesn't have data capture turned on? If the latter (or some other valid case), it should probably be an INFO level log
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that makes sense! changing to info
componentMetadata, err := svc.initializeOrUpdateCollector( | ||
attributes, updateCaptureDir) | ||
if err != nil { | ||
svc.logger.Errorw("failed to initialize or update collector", "error", err) | ||
} else { | ||
newCollectorMetadata[*componentMetadata] = true | ||
} | ||
} else if attributes.Disabled { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this else if, or could it just be else? Would we not want to do this if say !attributes.Disabled && svc.captureDisabled
, for example?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we do need it to be else if, because if svc.captureDisabled
, either no collectors existed in the past, or toggledCaptureOff
would be true and all collectors would be closed on line 500. From there, no other collectors should have been initialized so as long as the first part of the if is true, then there wouldn't be more collectors to worry about
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok that makes sense, this looks good then. Definitely a bit fragile that the logic here depends on these implicit connections over a pretty wide portion of code, which I think contributed to the bug happening. Just making a note that this is probably something we should try to simplify in the near future
@@ -134,6 +134,8 @@ func newTestDataManager(t *testing.T, localArmKey, remoteArmKey string) internal | |||
} | |||
} | |||
|
|||
// this arm should also have a collectors since it's using the data manager service |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's this comment referring to?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops deleted, just self referencing
func TestDataCapture(t *testing.T) { | ||
tests := []struct { | ||
name string | ||
initialDisableStatus bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: disabled x2
@@ -59,6 +59,7 @@ type collector struct { | |||
cancelCtx context.Context | |||
cancel context.CancelFunc | |||
capturer Capturer | |||
closed bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this bool needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's being used in the Closed() to prevent it from panicking if already closed. Originally was thinking that a collector might close, but not be deleted. But then I added the delete because a new collector should be generated and a new file path should, so maybe no longer needed. will review
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which line in close was panicking before? Adding this doesn't seem unreasonable to me and it's probably better to be defensive in this case re panics, moreso asking if the var is necessary to prevent that
@@ -469,6 +492,7 @@ func (svc *builtIn) Update(ctx context.Context, cfg *config.Config) error { | |||
} | |||
|
|||
toggledCaptureOff := (svc.captureDisabled != svcConfig.CaptureDisabled) && svcConfig.CaptureDisabled | |||
// toggledCaptureOn := (svc.captureDisabled != svcConfig.CaptureDisabled) && !svcConfig.CaptureDisabled |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this can be deleted
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
deleted
|
||
// let run for a second | ||
time.Sleep(captureWaitTime) | ||
midCaptureFiles := getAllFiles(tmpDir) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why's this mid needed? I think only updated
should be necessary
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added mid so it provides results after the disabled variable has been changed and that the collector gets closed and new data gets pushed out. It's a slightly diff value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we just want to test that the it capture is now (disabled | enabled) after Update is called, what does testing the file size twice get us vs just once? I think comparing updatedCaptureFiles
and initialFilzeSize
should accomplish the same thing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually wait I see you have
initialCaptureFiles := getAllFiles(tmpDir)
initialCaptureFilesSize := getTotalFileSize(initialCaptureFiles)
before you call Update with the new status. I think you should take those measurements after the Update.
This would allow you to get rid of mid
while testing the same thing - testing that data was or was not initially being captures, then testing that data was or was not captured after the update
time.Sleep(captureWaitTime) | ||
updatedCaptureFiles := getAllFiles(tmpDir) | ||
updatedCaptureFilesSize := getTotalFileSize(updatedCaptureFiles) | ||
if !tc.newDisableStatus && tc.initialDisableStatus { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add comments to each of these cases saying what they are? e.g. If disabled then enabled
The whole disabled instead of enabled thing made it hard for me to figure it out
return collector, &componentMetadata, nil | ||
} | ||
|
||
return nil, nil, errors.New("no collector was found with this config") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: with this config
-> with config <config string representation>
// capture always enabled | ||
test.That(t, len(updatedCaptureFiles), test.ShouldEqual, len(initialCaptureFiles)) | ||
} else { | ||
// capture ends disabled |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: enabled then disabled to match first comment
@@ -957,3 +957,121 @@ func getTestModelManagerConstructor(t *testing.T, server rpc.Server, zipFileName | |||
return model.NewManager(logger, cfg.Cloud.ID, client, conn, &mockClient{zipFileName: zipFileName}) | |||
} | |||
} | |||
|
|||
func TestDataCapture(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you remove the other capture disabled test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for explaining all the test logic to me 🚀
err = dmsvc.Update(context.Background(), testCfg) | ||
test.That(t, err, test.ShouldBeNil) | ||
|
||
// let run for a second |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
supernit: technically 25 milliseconds, so something like "Run capture for a moment"
err = dmsvc.Update(context.Background(), testCfg) | ||
test.That(t, err, test.ShouldBeNil) | ||
|
||
// let run for a second |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same with this comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really like the table-driven tests, much cleaner than a lot of the existing tests.
Supernit comments, but thanks for diving into this! The (initially simple) config-based capture and sync logic has grown exponentially in Update, so definitely worth refactoring in the next few weeks to simplify and make as readable & bug-free as possible.
} | ||
} | ||
|
||
func getAllFiles(dir string) []os.FileInfo { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice helper functions
|
Part 1 of the fix, dealing with the data capturing part. Making sure that when component capturing is disabled, the collector gets closed. As well, only initialize the collector if capturing isn't disabled to begin with.