Version comparison, but normalized to (best of each "era"? or maybe even separate points for v0.4.0/0.5.x vs. 1.x.x?) #60

jowens · 2020-05-08T23:44:01Z

I want to be able to tell clearly where v1.x.x is worse (or better) than older versions. Maybe take a single GPU's results (like K40c/80, because that's what we have used the most in the past) and compare the results for all datasets and all primitives.

I am considering this as two versions:

Everything 1.0 and beyond = 1.x.x
Everything before 1.0 = 0.x.x (0.4.0 and 0.5.x)

Because 0.5.x was a feature release.

EXCLUDING:

datasets that don't have results for both versions
0.3.0 results. I know almost nothing about the code of this version and I don't know how relevant it is now.
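A minimal sketch of the grouping described above, assuming each result is a dict with hypothetical `version` and `dataset` fields (the real gunrock/io JSON field names may differ). It buckets versions into the two eras, drops 0.3.0, and keeps only datasets that have results in both eras:

```python
from typing import Iterable, Optional, Set


def era(version: str) -> Optional[str]:
    """Map a version string such as "0.5.1" or "v1.0.0" to an era label.

    Returns None for versions that are excluded from the comparison
    (0.3.0 and anything else that is not 0.4.x, 0.5.x, or 1.0+).
    """
    parts = version.lstrip("v").split(".")
    major, minor = int(parts[0]), int(parts[1])
    if major >= 1:
        return "1.x.x"
    if (major, minor) in ((0, 4), (0, 5)):
        return "0.x.x"
    return None


def datasets_in_both_eras(runs: Iterable[dict]) -> Set[str]:
    """Return the datasets that have at least one result in each era."""
    seen = {}
    for run in runs:
        label = era(run["version"])  # "version" is an assumed field name
        if label is not None:
            seen.setdefault(run["dataset"], set()).add(label)
    return {name for name, eras in seen.items() if eras == {"0.x.x", "1.x.x"}}
```

Treating every 0.4.x release as part of the pre-1.0 era is an assumption here; the comment above only names 0.4.0 and 0.5.x.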

jowens · 2020-05-24T21:03:35Z

@neoblizz : would you recommend doing this on ... K40? V100? Both? We have enough data for both? I think we probably do.

neoblizz · 2020-05-26T00:52:02Z

Please do it with both. We have enough data for V100 and K40c, and maybe even some Titans (Xp, V).

jowens · 2020-05-26T02:11:02Z

OK. And just run this over ... the entire io repo?

Basically this graph, yes? gunrock/gunrock#642 (comment)

jowens · 2020-05-26T03:58:28Z

(Not quite that graph. But something not too different from that graph. Would be helpful to be able to pick a subset of gunrock/io to search through, if you knew where to look for pre-1.0 stuff.)

neoblizz · 2020-05-26T05:09:56Z

We can use these:

jowens · 2020-05-26T05:32:02Z

Ah, that's helpful, thank you.

jowens · 2020-05-28T01:53:10Z

OK, I ran these over the entire database. These 8 plots are mostly consistent with gunrock/gunrock#642. There are no K40 comparisons. Right now there is no differentiation between different options (undirected, mark_pred, etc.); it just takes the fastest one. Anything "above" the line at 1 is where a pre-1.0 version is faster than our fastest 1.0+ version.
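One way to compute the normalized value behind "the line at 1", as a sketch only. It assumes the metric is elapsed time (lower is better) and that each run is a dict with hypothetical `primitive`, `gpu`, `dataset`, `era`, and `elapsed_ms` fields; the actual plotting script may use a different metric or plot one point per pre-1.0 version instead of one per era:

```python
from collections import defaultdict


def pre_1_0_speedups(runs):
    """Return {(primitive, gpu, dataset): ratio} for keys present in both eras.

    ratio = fastest 1.0+ time / fastest pre-1.0 time, so a value above 1
    means a pre-1.0 version was faster than the fastest 1.0+ version.
    """
    best = defaultdict(dict)  # key -> {era label: fastest elapsed time}
    for run in runs:
        key = (run["primitive"], run["gpu"], run["dataset"])
        label, elapsed = run["era"], run["elapsed_ms"]
        if label not in best[key] or elapsed < best[key][label]:
            best[key][label] = elapsed
    return {
        key: eras["1.x.x"] / eras["0.x.x"]
        for key, eras in best.items()
        if "0.x.x" in eras and "1.x.x" in eras
    }
```

Keys with results in only one era are dropped, which is why a primitive/GPU pair with no 1.0+ runs produces no points at all.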

There are no PageRank comparisons on K40 AFAICT.

If I'm missing something, let me know. Tell me what I need to do better, @neoblizz ?

jowens · 2020-05-28T01:53:48Z

(These look really scant, but our data is pretty scant. We didn't run over very many datasets until we got to 1.0.)

jowens · 2020-05-28T01:54:07Z

(Easy to leave out primitives that don't have comparisons.)

neoblizz · 2020-05-28T01:58:27Z

That's ok, I like the list of datasets we have. You can leave that. Also, it's hard for me to believe there's no PR run on K40; I'll look for that. These look perfect, and they have 1000 times more value to me when I can hover over a point and actually see the exact config for that version/commit.

neoblizz · 2020-05-28T01:59:58Z

Off-topic: How difficult is it to set up your script on, say, daisy? If we can create a GitHub action for this kind of graph, where whenever a new result is pushed onto a certain directory, it automatically updates these graphs and makes them live -- that will be really neat.

jowens · 2020-05-28T02:19:41Z

It is certainly helpful if you point to a run and say "why isn't that included in the comparison".

We could set it up on daisy. We'd have to install a bunch of stuff, but no biggie. (Altair etc. uses a lot of nodejs stuff, Selenium for headless PDF rendering, etc.).

jowens · 2020-05-28T02:21:59Z

(There are no results at all for PR/K40 on any Gunrock version, so clearly an error on my part, I'll look into that.)

neoblizz · 2020-05-28T03:09:17Z

> (There are no results at all for PR/K40 on any Gunrock version, so clearly an error on my part, I'll look into that.)

https://github.com/gunrock/io/blob/master/gunrock-output/topc/PageRank_hollywood-2009_Thu%20Nov%2017%20230545%202016.json#L45 here's an example.

jowens · 2020-05-28T03:19:17Z

On it.

jowens · 2020-05-28T15:45:13Z

Ah. OK. After much debugging, it turns out the reason we don't have any PR speedups to show on K40 is that we don't have any 1.0+ PR results at all on K40. Sooooo that makes sense.

https://github.com/gunrock/io/tree/master/gunrock-output/v1-0-0/pr

jowens · 2020-05-28T18:34:57Z

And now this is more what I think we want. For each primitive, I have an "all" plot (one row, fastest for each primitive/GPU combo). Then I have a more detailed plot that separates out the options (which are different for each primitive) (multiple rows, one for each primitive/GPU/options combo). Gonna paste in one comment per primitive now. @neoblizz lemme know what you think.
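The difference between the "all" plot and the detailed plot comes down to which fields form the grouping key. A small sketch, assuming hypothetical field names (`primitive`, `gpu`, `dataset`, `options`, `elapsed_ms`) and an options value that has already been flattened to a string:

```python
# Hypothetical field names; the real result JSON may name these differently.
ALL_FIELDS = ("primitive", "gpu", "dataset")
DETAIL_FIELDS = ALL_FIELDS + ("options",)


def fastest_by(runs, fields):
    """Return {key: fastest run}, where key is the tuple of the given fields."""
    best = {}
    for run in runs:
        key = tuple(run[f] for f in fields)
        if key not in best or run["elapsed_ms"] < best[key]["elapsed_ms"]:
            best[key] = run
    return best
```

Calling `fastest_by(runs, ALL_FIELDS)` gives one point per primitive/GPU/dataset regardless of options, while `fastest_by(runs, DETAIL_FIELDS)` keeps a separate fastest run for each options combination.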

jowens · 2020-05-28T18:35:40Z

jowens · 2020-05-28T18:36:04Z

jowens · 2020-05-28T18:36:34Z

jowens · 2020-05-28T18:36:49Z

Added gunrock-version comparisons gunrock/io#60

neoblizz · 2020-05-28T22:30:34Z

This is great, I really like this. Very easy to see what's going on.
I will run PR on K40c on Luigi and get you some results for a baseline.

jowens · 2020-05-28T22:37:57Z

I can't believe how many tests you ran, and still these plots are pretty sparse. Ah well. We have plenty of data points to explore though, and they're ~the same as the ones we found a couple of months back that @crozhon is looking at, so that's a great outcome.

neoblizz · 2020-05-28T22:45:32Z

This also helps me see what datasets we are missing for some primitives. I would really like to fully automate this process using actions and have some sort of feedback/report system that can inform us if things go wrong or stuff is missing.

It's a bit of effort to get that all set up though. But worth it.

jowens · 2020-05-28T22:56:33Z

Yeahhhhh I just worry that it's a lot of time to set up and it's not going to repay the effort it takes to set it up. I mean, we don't check in new results all that often. And every time we add something new I have to go muck with the scripts to make them support it anyway ... I guess I'd like to know that someone is going to use this a fair bit before we invest that time.

jowens · 2020-06-08T23:44:06Z

The thing we could do that's perhaps a little more straightforward is a non-graph-based Python script that basically prints out a comparison. If the script took as input "here are the new files you checked in", it could perhaps print out a summary for each experiment ("the fastest previous run you did was X"). That would require, however, that we'd presumably have a local copy of gunrock-output somewhere; not sure it would be kind to slurp tens of thousands of files from GitHub every time we ran the script. I think this would be more useful than a graph, frankly. But it's not particularly high priority.
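A rough sketch of what such a text-only script might look like, assuming a local checkout of gunrock-output and result JSON files that carry `primitive`, `dataset`, `gpu`, and `elapsed_ms` fields. Those field names and the `summarize` helper are hypothetical, not the actual gunrock/io schema or tooling:

```python
import json
from pathlib import Path

KEY_FIELDS = ("primitive", "dataset", "gpu")  # assumed JSON field names


def load(path):
    """Read one result file."""
    with open(path) as f:
        return json.load(f)


def summarize(new_files, archive_dir):
    """Return one summary line per new file, compared with the local archive."""
    new_paths = {Path(p).resolve() for p in new_files}
    fastest = {}  # experiment key -> (elapsed_ms, file name)
    for path in Path(archive_dir).rglob("*.json"):
        if path.resolve() in new_paths:
            continue  # do not compare a new file against itself
        old = load(path)
        key = tuple(old[f] for f in KEY_FIELDS)
        if key not in fastest or old["elapsed_ms"] < fastest[key][0]:
            fastest[key] = (old["elapsed_ms"], path.name)
    lines = []
    for path in sorted(new_paths):
        new = load(path)
        key = tuple(new[f] for f in KEY_FIELDS)
        label = "/".join(key)
        if key in fastest:
            elapsed, name = fastest[key]
            lines.append(
                f"{label}: new run {new['elapsed_ms']} ms; "
                f"fastest previous run was {elapsed} ms ({name})"
            )
        else:
            lines.append(f"{label}: new run {new['elapsed_ms']} ms; no previous run found")
    return lines
```

Printing the result with `print("\n".join(summarize(new_files, "gunrock-output")))` would give one line per newly checked-in file. This reads the whole archive on every call, which is fine for a local copy but is exactly the cost that makes fetching from GitHub each time unattractive.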

jowens mentioned this issue May 8, 2020

Figures for PR, SSSP, BFS and BC. gunrock/gunrock#725

Open

jowens self-assigned this May 26, 2020

jowens added a commit to gunrock/docs that referenced this issue May 28, 2020

Merge pull request #20 from gunrock/develop

5e2d7e2

Added gunrock-version comparisons gunrock/io#60
