Add developer documentation for benchmarking #11122
…etails to an advanced topics section.
This PR ports the benchmarks in https://github.com/vyasr/cudf_benchmarks, adding official benchmarks to the repository. The new benchmarks are designed from the ground up to make the best use of pytest, pytest-benchmark, and pytest-cases to simplify writing and maintaining benchmarks. Extended discussions of various previous design questions may be found on [the original repo](https://github.com/vyasr/cudf_benchmarks). Reviewers may also benefit from reviewing the companion PR creating documentation for how to write benchmarks, #11122. Tests will not pass here until rapidsai/integration#492 is merged.

Authors:
- Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
- AJ Schmidt (https://github.com/ajschmidt8)
- Bradley Dice (https://github.com/bdice)
- Michael Wang (https://github.com/isVoid)
- GALI PREM SAGAR (https://github.com/galipremsagar)
- Matthew Roeschke (https://github.com/mroeschke)

URL: #11125
Minor queries
Codecov Report
@@ Coverage Diff @@
## branch-22.10 #11122 +/- ##
===============================================
Coverage ? 86.47%
===============================================
Files ? 144
Lines ? 22856
Branches ? 0
===============================================
Hits ? 19765
Misses ? 3091
Partials ? 0
Some more minor comments, but overall looks good I think.
@shwina @wence- @isVoid thanks for your patience here. I think that I have addressed all the comments. I left open the few discussions that I thought still required some response. I also adapted what we discussed in #11180 (comment) into the discussion on how to handle parametrization in different scenarios. Note that there is significant overlap in some parts of this document with discussions on testing. However, since I anticipate this PR being merged before #11199, I am fine with getting this done and then consolidating the work in that PR.
When it comes to parametrizing tests, we have a number of options at our disposal.
One option is fixtures, while a second is using `pytest.mark.parametrize`.
A third option is provided by the [`pytest_cases`](https://smarie.github.io/python-pytest-cases/) `pytest` plugin.
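For readers following the thread, here is a minimal sketch of the first two options named in the passage above: a parametrized fixture and `pytest.mark.parametrize`. The helper `make_data` and the test names are hypothetical and do not come from the cuDF test suite; plain Python lists stand in for cuDF objects so the sketch only needs `pytest`.

```python
import pytest


def make_data(nrows):
    """Hypothetical helper: build the object under test (a plain list here)."""
    return list(range(nrows))


# Option 1: a parametrized fixture. Every test that requests `data`
# runs once per value in `params`, so the parameters are shared.
@pytest.fixture(params=[10, 100])
def data(request):
    return make_data(request.param)


def test_sum_with_fixture(data):
    assert sum(data) == len(data) * (len(data) - 1) // 2


# Option 2: pytest.mark.parametrize. The parameters are attached to
# this one test and are not shared with any other test.
@pytest.mark.parametrize("nrows", [10, 100])
def test_sum_with_parametrize(nrows):
    data = make_data(nrows)
    assert sum(data) == nrows * (nrows - 1) // 2
```

The practical difference is scope: the fixture can be requested by any number of tests, while the `parametrize` list lives on a single test function.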
We concluded there's a hierarchy of complexity, which is illustrated with some guidance a little bit further down this paragraph. It's not fully cut-and-dried, which is why this guidance is soft rather than hard.
My fear is that unless it's cut-and-dried, most developers would rather not deal with complexity.
If I develop a feature and need to write a test or benchmark for it, I'm going to choose the path of least resistance, which could include:
- writing as little code as possible
- touching as few files as possible
- learning as few new tools as possible
Sorry to pile on this discussion further (and late!).
I think the current set of rules for choosing a method of parametrization is well-written and as concise as it can be.
But, it's quite mind-bending to think about the intersectionality of parameters for someone wanting to develop a feature and write incremental tests or benchmarks for it. Incorporating fixtures into this workflow is easy and familiar as one identifies reusable components between tests. Incorporating cases seems more difficult and really only possible in retrospect - or am I thinking of this all wrong?
I am not really sufficiently familiar with cases to say, but I think that is reasonable. Probably you would start out writing fixtures, and then, as they became more baroque, wonder if there were a better way (which I believe cases offer).
I don't think that anything that you've said is wrong or untrue, but I think you're classifying as a difference in kind something that is really only a difference in degree.
I would argue that incorporating fixtures effectively is also really only possible in retrospect. As our test suite clearly demonstrates, most people's starting point (at least historically) is to write an unparametrized test with some hardcoded objects, then abuse `pytest.mark.parametrize` to test as many possibilities as you can. Getting everyone to switch to using fixtures already requires some amount of either 1) foresight into what tests you want to write, or 2) refactoring tests in hindsight. Typically the latter occurs as part of PR review, although over the past year or two I think many developers have become more accustomed to thinking in terms of reusable fixtures and reaching for them first.
Cases are just one more step up the complexity ladder. Instead of just thinking about how you might write a bunch of tests that take the exact same arguments, in order to start off with using cases you need to start thinking about tests that might have only partial overlap in parametrization. Otherwise, you refactor tests in hindsight. The review process will train people in those best practices in the same way that we've improved our fixture usage.
> I think the current set of rules for choosing a method of parametrization is well-written and as concise as it can be.
Just to make sure that I understand: am I interpreting this as saying that you're not yet convinced that we should use cases at all?
I did a poor job communicating what changes are required here.
I'm not unilaterally opposed to the use of `cases`, but I feel I need a better understanding of how much friction, if any, we would be adding to the process of developing benchmarks and - more importantly - tests, if we require them.
I think before moving forward with this recommendation, it'd be great to have a deeper discussion about `cases` offline. Perhaps with the rest of the team so we can get their feedback as well?
FWIW, bringing my experience from other projects that heavily used parameterized fixtures (but not pytest cases). I think there's a tradeoff between generality (and lack of code repetition) that the more complex approaches bring, and ease of debugging (and extraction) of a particular test when something goes wrong and you need to fix things. Although it is relatively easy to run a single test with `pytest test_file::test_name` after it fails for debugging purposes, I have often found that what I end up doing is pulling that test out into a single file that I can run with none of the pytest machinery in the way. If it then depends on a bunch of fixtures this is painful because you have to find all those and so forth.
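To make the extraction cost concrete, here is a rough sketch of what pulling one failing test into a standalone file might look like. Every name is hypothetical: `make_data` stands in for a fixture body that had to be located and copied, and `check_reverse_roundtrip` stands in for the body of the original test.

```python
# Standalone reproduction of one failing parametrized test, with the
# fixture and parameter values inlined so it runs without pytest.
# All names here are hypothetical.


def make_data(nrows):
    # Copied from the (hypothetical) fixture that the test depended on.
    return list(range(nrows))


def check_reverse_roundtrip(data):
    # Body copied from the original test function.
    assert list(reversed(list(reversed(data)))) == data


if __name__ == "__main__":
    # The single failing parameter combination, hardcoded.
    check_reverse_roundtrip(make_data(100))
    print("ok")
```

Each extra fixture that the test depends on adds another definition to copy by hand, which is the friction described in the comment above.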
I'll leave this conversation unresolved for now so that it's easy to find when we finally come back to this discussion.
Trivial grammar nit, but otherwise looks good, thanks!
In the interest of getting this PR merged sooner rather than later, I'm removing the discussion of cases for now until we can have a discussion and come to some consensus on how they should be used. I'm going to copy the exact relevant text out of the current document into this comment and then remove it from the doc so that we can merge and then revisit.
Discussion of cases
In the second case, fixtures are really functioning as parameters, which we discuss in the next section.
Parametrization: custom fixtures,
This PR adds a primary developer guide for Python. It provides a more complete and informative landing page for new developers. When #11217, #11199, and #11122 are merged, they will all be linked from this page to provide a complete set of developer documentation. There is one main point of discussion that I would like reviewer comments on, and that is the section on directory and file organization. How do we want that aspect of cuDF to look?

Authors:
- Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
- Matthew Roeschke (https://github.com/mroeschke)
- Lawrence Mitchell (https://github.com/wence-)
- Ashwin Srinath (https://github.com/shwina)

URL: #11235
@gpucibot merge
This PR documents best practices for writing cuDF Python benchmarks. It includes an overview of the various fixtures provided by our benchmarking suite to all benchmarks and indicates how best to make use of them. It also discusses the various features of our benchmarking suite (including easy comparison to pandas and running in CI) and what developers must do to maintain compatibility with those features.
A PR to incorporate the cudf_benchmarks repo into cudf proper is imminent, but this documentation PR can be reviewed (and merged) independently.