Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: SSAPI Pagination (BPS-276) #1965

Open
wants to merge 5 commits into
base: feat/ssapi-receiver
Choose a base branch
from

Conversation

Caleb-Hurshman
Copy link

Proposed Change

Paginate Splunk search result responses. Add an event_batch_size config option to control how many events (search results) to fetch at a time.

Checklist
  • Changes are tested
  • CI has passed

@Caleb-Hurshman Caleb-Hurshman requested review from schmikei and removed request for a team and dpaasman00 November 13, 2024 15:56
@@ -32,6 +32,7 @@ type Config struct {
Username string `mapstructure:"splunk_username"`
Password string `mapstructure:"splunk_password"`
Searches []Search `mapstructure:"searches"`
EventBatchSize int `mapstructure:"event_batch_size"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be on the search configuration itself?

Copy link
Author

@Caleb-Hurshman Caleb-Hurshman Nov 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess. If this is something that a user would want to change for different queries, then that's not too hard to accomodate. In my head, this setting was more useful for weighing how much load should be placed on the Splunk API vs. on the receiver and would be the same across all searches.

receiver/splunksearchapireceiver/api.go Outdated Show resolved Hide resolved
receiver/splunksearchapireceiver/model.go Show resolved Hide resolved
if logTimestamp.UTC().After(latestTime.UTC()) {
ssapir.logger.Info("skipping log entry - timestamp after latestTime", zap.Time("time", logTimestamp.UTC()), zap.Time("latestTime", latestTime.UTC()))
// logger will only log up to 10 times for a given code block, known weird behavior
// TODO: Consider breaking, assuming all logs are in order
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair thing to look in to; we'd need to make sure that the events being retrieved from Splunk are in order, which I'm not confident to say one way or another on quite yet

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took some time to look into this, the pattern I'm seeing is that events are returned latest to earliest, which makes sense based on how the Splunk search UI works. So it would be correct to break once a log is earlier than the earliest time.

var resultCountTracker = 0 // track number of results exported
var offset = 0 // offset for pagination
var limitReached = false
for {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's set up some time to vscode share a solution on polling; feel like having an infinite for loop in this case could actually be improved if we start looking at context.Context

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants