Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test performance regression for transform #14

Merged
merged 28 commits into from
Oct 3, 2024

Conversation

Anirban166
Copy link
Owner

No description provided.

Copy link

github-actions bot commented Jul 18, 2024

Comparison Plot

Generated via commit 2436c62

Download link for the artifact containing the test results: ↓ atime-results.zip

Task Duration
R setup and installing dependencies 3 minutes and 13 seconds
Installing different package versions 1 minutes and 52 seconds
Running and plotting the test cases 22 seconds

@Anirban166
Copy link
Owner Author

Seems like the GitHub server outage has been resolved.

@Anirban166
Copy link
Owner Author

Since the transform regression fixing PR (Rdatatable#5493) did not get merged into the master branch directly (was added as a part of 1.15.99), I can't use commits from that PR (branch is deleted too).

As I see, this is the commit where the fix to that transform regression was introduced in data.table's master (I can see other commits being mentioned in the PR thread, but all of them are on deleted branches and only this commit alone is in master).

So I think that should be the 'Fast'/'Fixed' commit. For the 'Slow'/'Regression' one, I picked the parent (Rdatatable@bf49909) of that commit.

But when I use 'Slow' and 'Fast' labels I get this plot:

image
which is obviously not correct or the results we are expecting given multiple reasons:

  • Both of these labels being together
  • HEAD and base are somehow worser than the Slow label
Slow = "bf499090c0e6fd5cb492bf8b1603d93c1ee21dfb"
Fast = "2d1a0575f87cc50e90f64825c30d7a6cb6b05dd7"

And worse, when I use 'Before', 'Regression', and 'Fixed' with the version for the first two being the same as the 'Slow' one (just to see what that would result in - ideally they should be together since they are the same noh?) and the version for 'Fixed' being the same as 'Fast' above, I get:

image
which again is incorrect and even more confusing when comparing to the one above, like:

  • Before and Fixed are together (I expected 'Before' and 'Regression' to be together since they are the same commit SHAs)
  • HEAD and base are together with 'Regression' now, unlike the case above
Before = "bf499090c0e6fd5cb492bf8b1603d93c1ee21dfb"
Regression = "bf499090c0e6fd5cb492bf8b1603d93c1ee21dfb"
Fixed = "2d1a0575f87cc50e90f64825c30d7a6cb6b05dd7"

Test case:

  # Fixed in: https://github.com/Rdatatable/data.table/pull/5493 (off-branch)
  # Merged to master in: https://github.com/Rdatatable/data.table/commit/2d1a0575f87cc50e90f64825c30d7a6cb6b05dd7
  "transform improved in #5493" = atime::atime_test(
    N = 10^seq(1, 20),
    setup = {
      df <- data.frame(x = runif(N))
      dt <- as.data.table(df)
    },
    expr = data.table:::`[.data.table`(transform(dt, y = round(x))),
    ...

@tdhock Do you know as to what may be happening? (I for one am confused and have no idea why the results are like this)

@Anirban166
Copy link
Owner Author

(Also, when I first tried different versions/SHAs for this weeks back, I remember checking commits from data.table.R's history as well since it seems from the plots I got earlier that there have been more improvements that affects this transform test case)

@tdhock
Copy link

tdhock commented Aug 15, 2024

I don't have time to look in detail but you are right this is clearly not working, you need to revise.
you should be able to un-delete branches under Rdatatable/data.table, go to relevant PR and click the button at the bottom.
Rdatatable#5493 is on a fork (OfekShilon/data.table) so you are right, atime can not pick commits from forks, so you should get historical commits from master before/after merging that PR.
i guess you could create a new branch under rdatatable/data.table with copies of commits from Rdatatable#5493, but I don't think that should be necessary (you should be able to get historical commits from master)

.ci/atime/tests.R Outdated Show resolved Hide resolved
.ci/atime/tests.R Outdated Show resolved Hide resolved
@Anirban166 Anirban166 merged commit 2e15a9e into master Oct 3, 2024
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants