Single threadpool #24

Merged
merged 5 commits into from
Feb 6, 2022

@connelldave (Owner) commented Jan 20, 2022

Addresses #20

I think this works; it seems to bench out the same as one_phase :) I like the OO relationship of representing the host account and its instantiated clients in this approach. I'd appreciate your thoughts, @iainelder. It's definitely a first pass.

It maintains the exposed var names, so it feels possibly releasable as 1.5 rather than intentionally breaking any existing code.

Would like to test properly and be able to compare the master version to this one.

Need to fix up all the tests, would be nice to be able to port your profiling suite into a pytest runnable thing for future too :)
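
As a rough illustration of the single-thread-pool idea described above (not botocove's actual implementation), the sketch below models a host account object that owns one `ThreadPoolExecutor` and runs a function once per target account. `HostAccount`, `run_in_accounts`, and the dict used as a fake session are hypothetical names for illustration only; real role assumption with boto3 is left out.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class HostAccount:
    """Hypothetical stand-in for the account that fans work out to its targets."""

    target_ids: List[str]
    max_workers: int = 20

    def _make_session(self, account_id: str) -> Dict[str, str]:
        # Assumption: a real implementation would assume a role here.
        return {"account_id": account_id}

    def _invoke(self, func: Callable[[Dict[str, str]], Any], account_id: str) -> Dict[str, Any]:
        # The session is created and used inside the same worker call.
        session = self._make_session(account_id)
        try:
            return {"Id": account_id, "Result": func(session)}
        except Exception as exc:
            # Capture the failure per account instead of aborting the whole run.
            return {"Id": account_id, "Exception": exc}

    def run_in_accounts(self, func: Callable[[Dict[str, str]], Any]) -> List[Dict[str, Any]]:
        # One pool for the whole run; map() returns results in input order.
        with ThreadPoolExecutor(max_workers=self.max_workers) as pool:
            return list(pool.map(lambda acc: self._invoke(func, acc), self.target_ids))
```

Calling `HostAccount(target_ids=[...]).run_in_accounts(fn)` returns one result dict per target ID, in the order the IDs were given.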

@connelldave force-pushed the single_threadpool branch 3 times, most recently from 94ef9de to 229327e, on January 20, 2022 21:58

@iainelder (Collaborator) left a comment

I haven't reviewed the cove implementation yet, but I'll make time for it this week.

I've left some comments and questions on the new CI code in the meantime.

# make bootstrap
# make release

- name: Perform CodeQL Analysis
Collaborator

I didn't know about CodeQL! Looks like SonarQube or Semgrep.

In this PR I don't see any config for the rules it would use (I don't know what they would look like.) Is it using a default rule set?

Owner Author

It's a bit weird and magical. I just hit the "add CodeQL" button and it gave me some YAML for an action :D I don't get much chance to use GitHub these days, so I thought it couldn't hurt to stick this in to try. It's using whatever its default rules are, but I did have to squash a bunch of false positives in the action itself, so it's storing state somewhere too.

I can't say I'm super impressed so far, but SAST tools always feel a bit awkward. I dropped Bandit in this PR too, as it irritated me enough.


- name: Lint with mypy
run: poetry run mypy .
run: poetry run pre-commit run --all-files
Collaborator

Yes, I love pre-commit!

hooks:
- id: pytest
name: pytest
entry: poetry run pytest tests
Collaborator

Should a failing unit test prevent me from ever committing my code? (I think Pytest has an annotation to allow failures.)

Should I have to wait for the unit tests to complete every time I make an incremental commit? (I haven't checked yet to see how long they take now.)

I don't mind linters and other static analyzers here in pre-commit, but I would tend to avoid having such a heavyweight check in the pre-commit.

What do you think?
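
The pytest annotation mentioned above is presumably the `xfail` marker. A minimal sketch of how it lets a known-failing test stay in the suite without blocking the hook follows; `parse_account_id` and both tests are hypothetical examples, not code from this repository.

```python
import pytest


def parse_account_id(raw: str) -> str:
    """Hypothetical function under test: strips whitespace from an account ID."""
    return raw.strip()


@pytest.mark.xfail(reason="known gap: short IDs are not zero-padded yet", strict=False)
def test_pads_short_ids():
    # Expected to fail until padding is implemented; pytest reports this as
    # "xfailed" and the overall run still passes.
    assert parse_account_id(" 123 ") == "000000000123"


def test_strips_whitespace():
    assert parse_account_id(" 123456789012 ") == "123456789012"
```

With `strict=False`, the marked test passing unexpectedly is reported as "xpassed" rather than as a failure.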

Owner Author

I think this is a mildly controversial choice, but I like it for how it shapes thinking and keeps testing front of mind. The tests/ folder is fast and should always be kept fast; a slow test there would be a regression or a bad change in general. We can use a different test dir for slower/integration tests.

The pros: this is a tool whose established API contract we don't really want to break, so encouraging tests to be kept up to date per commit, in small units of change, is a positive. Adding tests alongside new features at commit time shouldn't be too burdensome, and failing commits can still be made with --no-verify for WIP changes on branches.

That said, it's super annoying for this kind of PR and any other big code architecture changes. Hopefully this will be the last of those! If the feedback is that it's a net negative, I'm open to dropping it.

Collaborator

Okay, let's try it!

@iainelder
Collaborator

> Need to fix up all the tests, would be nice to be able to port your profiling suite into a pytest runnable thing for future too

This might be a bold step, but I'd like to remove the part of CoveSessions (or whatever may replace it) that deduplicates the account IDs. Allowing duplicate account IDs is just so useful for load testing.

If you are aware of customer code that already depends on the deduplication, then maybe it could be optionally disabled.
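
One way the optional behaviour suggested above could look is sketched here. `resolve_target_ids` and its `deduplicate` flag are hypothetical names, not part of botocove's API, and the order-preserving deduplication is an assumption about the desired behaviour.

```python
from typing import Iterable, List


def resolve_target_ids(account_ids: Iterable[str], deduplicate: bool = True) -> List[str]:
    """Return the account IDs to run against, optionally keeping duplicates."""
    ids = list(account_ids)
    if not deduplicate:
        # Load-testing mode: every repeated ID becomes its own unit of work.
        return ids
    # dict.fromkeys drops repeats while preserving first-seen order.
    return list(dict.fromkeys(ids))
```

Keeping `deduplicate=True` as the default would leave existing callers unaffected, while a load test could pass `deduplicate=False` with the same ID repeated many times.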

@connelldave
Owner Author

> > Need to fix up all the tests, would be nice to be able to port your profiling suite into a pytest runnable thing for future too
>
> This might be a bold step, but I'd like to remove the part of CoveSessions (or whatever may replace it) that deduplicates the account IDs. Allowing duplicate account IDs is just so useful for load testing.
>
> If you are aware of customer code that already depends on the deduplication, then maybe it could be optionally disabled.

I guess there's no huge value in deduplication but who knows what customer code is in the wild at this point. I expected this change to break external behaviour enough to warrant a 2.0 release but it doesn't feel like that's the case so far. We could just monkeypatch somewhere in the test though?
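
A sketch of the monkeypatching idea, using the standard library's `unittest.mock` rather than pytest's `monkeypatch` fixture so that it runs on its own. `Targets`, `deduplicate`, and `load_test_targets` are hypothetical stand-ins; the real deduplication code may be structured differently.

```python
from typing import List
from unittest import mock


class Targets:
    """Hypothetical stand-in for the code that resolves target account IDs."""

    @staticmethod
    def deduplicate(account_ids: List[str]) -> List[str]:
        return list(dict.fromkeys(account_ids))

    @classmethod
    def resolve(cls, account_ids: List[str]) -> List[str]:
        return cls.deduplicate(account_ids)


def load_test_targets(account_id: str, copies: int) -> List[str]:
    # Replace deduplication with a pass-through for the duration of this call
    # only; the original method is restored when the with-block exits.
    passthrough = staticmethod(lambda ids: list(ids))
    with mock.patch.object(Targets, "deduplicate", passthrough):
        return Targets.resolve([account_id] * copies)
```

This keeps the public behaviour unchanged for users while letting a load test supply the same account ID many times.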

@iainelder
Collaborator

Testing it with my new profiling code shows that it solves the memory problem.

That for me is enough reason to merge this as soon as you're ready.

It should allow me to keep running botocove on my local machine without having to set up an EC2 instance for bigger jobs.

Then we could focus on some of the other improvements on the backlog :-)

@iainelder
Collaborator

If you think the new profiling code is useful, should I prepare a new PR when this is merged?

@connelldave connelldave marked this pull request as ready for review February 6, 2022 20:56
@connelldave
Owner Author

> If you think the new profiling code is useful, should I prepare a new PR when this is merged?

Would be good! I'm thinking of making botocove itself callable with a CLI, which might be a nice way to implement profiling...


2 participants