
benchmarking infrastructure #24

Merged (17 commits into networkx:main on Jan 5, 2024)

Conversation

Schefflera-Arboricola (Member)

Addresses issue #12.

Here I have implemented a basic benchmarking infrastructure using ASV to compare the parallel and normal implementations of the betweenness_centrality function. I couldn't figure out how to run it for different numbers of nodes and edge probabilities; we would probably have to define our own benchmarking classes for that.

The steps to run the benchmarks are in README.md (results screenshot below):
[screenshot: benchmark results]

Please give your feedback so I can improve it.

Thank you :)

References: ASV docs, networkx benchmarking, numpy benchmarking

dschult (Member) left a comment:

Thanks for this!

I think the parameters for number of nodes and edge probability can be passed into the test function in a similar way to how you have done it with algo_type. The values are specified in params, and they get passed into the test function as input parameters.
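For instance, a minimal sketch of how that might look (the specific parameter values here are placeholders, not the ones settled on in this PR):

import networkx as nx

class Betweenness:
    # asv calls each time_* method once for every combination of these
    # values, passing them in as positional arguments
    params = [["parallel", "sequential"], [50, 100, 300], [0.3, 0.5]]
    param_names = ["algo_type", "num_nodes", "edge_prob"]

    def time_betweenness_centrality(self, algo_type, num_nodes, edge_prob):
        G = nx.fast_gnp_random_graph(num_nodes, edge_prob, seed=42, directed=False)
        nx.betweenness_centrality(G)  # dispatch on algo_type would go here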

I have a comment below too. :)

def time_betweenness_centrality(self, algo_type):
    num_nodes, edge_prob = 300, 0.5
    G = nx.fast_gnp_random_graph(num_nodes, edge_prob, directed=False)
    if algo_type == "parallel":
dschult (Member) commented on this snippet:

Maybe we should also have a third algo_type, where we run it using a keyword instead of converting the graph to a parallel_graph. That is, the call would be something like:

    _ = nx.betweenness_centrality(G, backend="nx-parallel")

I guess we'd better make sure that works first, and gives the same results. But it should work, and it'd be nice to know if they are the same speed. Of course, if they are almost identical all the time, then we probably should remove it again.
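For clarity, a minimal sketch of the two call styles being compared (assuming the nx_parallel import name and that the backend ends up registered as "parallel", as discussed later in this thread):

import networkx as nx
import nx_parallel as nxp

G = nx.fast_gnp_random_graph(300, 0.5, seed=42)

# call style 1: convert to a ParallelGraph and let type-based dispatching pick the backend
H = nxp.ParallelGraph(G)
result_from_conversion = nx.betweenness_centrality(H)

# call style 2: keep the plain nx.Graph and request the backend via keyword
result_from_keyword = nx.betweenness_centrality(G, backend="parallel")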

Schefflera-Arboricola (Member, Author):

@dschult
The output is the same for both ways (i.e. converting into a parallel graph and using the keyword argument) when I round the betweenness_centrality values to 2 decimal places; otherwise, it does give different values sometimes.

And yes, both ways are almost identical in terms of time, but I've still added it in the comments here in case someone wants to try it. There is some difference for 500-node graphs, as seen in the following screenshot:
[screenshot: timing comparison for 500-node graphs]

But this difference seems negligible when we bring sequential into the picture (as seen in the following screenshot):
[screenshot: timing comparison including sequential]

Also, I have added num_nodes and edge_prob and changed the normal algo_type to sequential in this new commit.

Thank you :)

Schefflera-Arboricola (Member, Author):

Also, should I create a PR to change the plugin name from parallel to nx-parallel, if that's required?

dschult (Member):

Thanks for those time comparisons. I'm glad it doesn't change the time much between using the backend by object type and by using it via a keyword arg.

I think the naming convention for these backend libraries is supposed to be, e.g., nx-cugraph when the plugin name is cugraph. I will check with other people though.

The additional graph_type parameters you mention are already obtained from the graph object, so the backend gets that info via, e.g., G.is_directed().

Graph generators will need to be done either in networkx before calling the function, or in nx-parallel by writing a parallel version of that function. I believe the PR with the graph generators getting @dispatch decorators has been merged. The weight parameter should be passed through to the backend function. If it is None then we use an unweighted approach.
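(As a small illustration of that point: the graph object itself already carries the graph-type information, and the weight keyword distinguishes the weighted and unweighted cases. This is a generic sketch, not code from this PR.)

import networkx as nx

G = nx.fast_gnp_random_graph(300, 0.5, seed=42)
G.is_directed()     # False; the backend can query this directly from the graph
G.is_multigraph()   # False

# weight=None means an unweighted traversal; a string names the edge attribute to use
nx.betweenness_centrality(G, weight=None)
nx.betweenness_centrality(G, weight="weight")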

I haven't looked at the new commit yet, but hope to soon. :}

dschult added the label type: Enhancement (New feature or request) on Oct 25, 2023
Schefflera-Arboricola (Member, Author) commented Nov 7, 2023

I wanted some guidance on how we should decide whether to include more parameters. I was thinking of including the following params and param_names:

  • graph_type : ["undirected", "directed", "multigraph"]
  • is_weighted : ["weighted", "unweighted"]
  • graph_generation : ["fast_gnp_random_graph", "erdos_renyi_graph", "watts_strogatz_graph", "barabasi_albert_graph"]

But when should we include a new parameter: only when it is very different (in terms of time) from the undirected, unweighted graph made by nx.fast_gnp_random_graph, or should we include different graph_types (and other parameters) just because that algorithm can take in different graph types?

Thank you :)

[EDIT] I tried running benchmarks for graph ["directed", "undirected"] and is_weighted ["True", "False"] (I used random.random() to assign random weights to edges and G = nx.fast_gnp_random_graph(num_nodes, edge_prob, seed=42, directed=True) to create random directed graphs). There wasn't a lot of variation from the undirected, unweighted graph, at least for betweenness_centrality (as seen in the following screenshots):

[screenshot: negligible variation between directed and undirected graphs]

[screenshot: negligible variation between weighted and unweighted graphs]

[screenshot: all benchmarks]
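A sketch of how those variants could be generated, following the description in the edit above (the "weight" attribute name is my assumption):

import random
import networkx as nx

num_nodes, edge_prob = 300, 0.5

# directed variant
G_directed = nx.fast_gnp_random_graph(num_nodes, edge_prob, seed=42, directed=True)

# weighted variant: assign a random weight to every edge
G_weighted = nx.fast_gnp_random_graph(num_nodes, edge_prob, seed=42)
for u, v in G_weighted.edges:
    G_weighted[u][v]["weight"] = random.random()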

dschult (Member) left a comment:

Some big picture questions:

  • should we use a class here? Does asv require it? It seems like we are really just setting up a function (sorry if I am just forgetting what asv needs here...)
  • It seems like a separate module for each benchmark will become cumbersome once we have lots of timings to do. How should we organize the timings? Is there another scientific-python oriented repo that has an established directory structure for benchmarks that we could follow?

Inline review comments were left on benchmarks/benchmarks/bench_centrality.py, benchmarks/benchmarks/common.py, and benchmarks/benchmarks/bench_tournament.py (since resolved).
Schefflera-Arboricola (Member, Author) commented Nov 26, 2023

Replying to the "Some big picture questions" comment:

Current benchmarks directory structure (illustrated in the sketch below):

  • each bench_ file corresponds to a folder/file in the networkx/algorithms directory in NetworkX
  • each class inside a bench_ file corresponds to a file within that folder in networkx/algorithms (a single class if it is itself a file)
  • the class name corresponds to the file name, and the x in bench_x corresponds to the folder name (the class name and x are the same if it is a file in networkx/algorithms)
  • each time_ function corresponds to a function in that file
  • in the future, for other folders in networkx/networkx like generators, classes, linalg, utils, etc., we can have separate bench_ files for each of them, with classes corresponding to the different files in each of those folders

Advantage of the above structure:

  • nicely organized, easy to navigate, and easy to add more benchmarks.
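As a rough illustration of this mapping (bench_vitality.py is a hypothetical name here; the other file names appear in this PR):

benchmarks/benchmarks/
    bench_centrality.py    # networkx/algorithms/centrality/   -> e.g. class Betweenness
    bench_vitality.py      # networkx/algorithms/vitality.py   -> class Vitality
    bench_tournament.py    # networkx/algorithms/tournament.py -> class Tournament
    common.py              # shared helpers (Benchmark base class, get_graph, timing_func)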

should we use a class here? Does asv require it? It seems like we are really just setting up a function (sorry if I am just forgetting what asv needs here...)

I have used classes just to keep it all organised.

Is there another scientific-python oriented repo that has an established directory structure for benchmarks that we could follow?

I will see if there is a better way of organizing it. And please let me know what you think of the current way.

Thank you :)

Schefflera-Arboricola (Member, Author):

@dschult I've made the suggested updates; please let me know if this looks good to you.

dschult (Member) left a comment:

Thanks for that summary -- and description of the file/class/name arrangement.
I think there may be functions where we want more than one timing method for that function. But we can do that by adding descriptive words to the time_* method names.

All this big picture stuff looks good to me. We should revisit it after implementing a few more and make sure it is working well. :)

====== get_graph vs setup and how it fits with timing_func
The current PR creates the graph as part of the function call being timed. So we will be timing both the graph construction and the algorithm (I think).
Let's try using a setup method to create the graph followed by a time_<name> method to call the function on that graph.

For timing_func, there seems to be a lot of renaming of information that doesn't add much. But again, I could be missing the point, so if I am, let me know. What I am seeing is that for vitality we use the names get_graph, func=, Vitality, and Benchmark, and none of them seem to provide much beyond what we could just do ourselves inline. The code is:

class Vitality(Benchmark):
    params = [(algo_types), (num_nodes), (edge_prob)]
    param_names = ["algo_type", "num_nodes", "edge_prob"]

    def time_closeness_vitality(self, algo_type, num_nodes, edge_prob):
        timing_func(
            get_graph(num_nodes, edge_prob), algo_type, func=nx.closeness_vitality
        )

while we could have:

class VitalityBenchmark:
    params = [algo_types, num_nodes, edge_prob]
    param_names = ["algo_type", "num_nodes", "edge_prob"]
    
    def setup(self, algo_type, num_nodes, edge_prob):
        return nx.fast_gnp_random_graph(num_nodes, edge_prob, seed=42, directed=False)

    def time_closeness_vitality(self, G, algo_type, num_nodes, edge_prob):
        return nx.closeness_vitality(G, backend=algo_type)

Then when reading the code, we don't need to go look up what is in timing_func, and (for other functions) we could include parameters that don't get changed during this timing run.

Notice that I moved the graph creation outside of the timing method into the setup method. That's so we only compare the computation time, and not the construction time.

Notice that with this formulation, the algo_type would be a backend name (with None representing the sequential networkx code). What do you think?
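(For reference, a sketch under the assumption that asv's setup stores state on self and that each time_* method receives the same parameters as setup; the parameter values are placeholders, and None stands for the sequential networkx code as suggested above.)

import networkx as nx

class VitalityBenchmark:
    params = [[None, "parallel"], [50, 100, 300], [0.3, 0.5]]
    param_names = ["backend", "num_nodes", "edge_prob"]

    def setup(self, backend, num_nodes, edge_prob):
        # built once per parameter combination, outside the timed region
        self.G = nx.fast_gnp_random_graph(num_nodes, edge_prob, seed=42, directed=False)

    def time_closeness_vitality(self, backend, num_nodes, edge_prob):
        nx.closeness_vitality(self.G, backend=backend)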

Inline review comments were left on benchmarks/benchmarks/common.py and .gitignore (since resolved).
Schefflera-Arboricola (Member, Author) commented Nov 27, 2023

@dschult thank you for the detailed and insightful review!

Yeah, it's better if we keep graph creation and the timing functions separate. I guess I did it this way to make the code more compact, thinking that the graph creation time would be included for both algo_types (but that wouldn't give correct timings for the functions, now that I think about it; thanks for pointing it out!). Also, there might be cases where we need different initial graphs for different functions in the same class; then I think we can have if conditions in the setup function.

About the base Benchmark class: I was referring to the scipy and numpy libraries, and they have a base class (with just a pass statement) for all benchmarks. But in pandas there is no common base class. So I decided to have a common base class for all the benchmark classes, just in case we would like to add something to it in the future. Please let me know your thoughts on this.

Also, I agree that using backend="parallel" is better than H = nxp.ParallelGraph(G), because it does take some extra time to convert the graph to a ParallelGraph (not a lot, around 0.00000405... seconds). Also, this way we won't need the if algo_type == "parallel" condition, and timing_func will then become just a one-line function, so we can get rid of it.

Also, I think it will be better to make it backends instead of algo_types.

Thank you :)

Schefflera-Arboricola (Member, Author) commented Nov 27, 2023

Also, I think we should keep the get_graph function, because then the graphs will be saved in the cache (because of @lru_cache(typed=True)), so it would take less time when we run all the benchmarks. Please let me know your thoughts on this :)

dschult (Member) commented Nov 27, 2023

I expect that we will need many different types of graphs for the algorithms, so get_graph may not be able to be used very often. At least that is what happens with tests and pytest: we create a test class and put in some graphs that could be used by lots of tests, but then they end up only being used by a couple of the tests, and the rest construct their own graph. So we may want to create a different setup function for many of the timing functions. I think we'll have to wait and see what is most useful. Keeping it as a common function is fine with me. Could we name it something like cached_gnp_random_graph?

Does caching work well with floating point inputs like p? Nice binary values like 0.25 should work, but round-off can occur with, e.g., 0.3. I guess if the round-off is the same for each call of the function we're OK.

In general, caching is nice for time consuming parts that either take a lot of time, or get repeated many times. We’ll see as we get more examples whether the timing is taking a long time and how much of that time is constructing the graphs. Let’s leave it in for now.
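A sketch of what that helper might look like (the name follows the suggestion above, lru_cache(typed=True) matches what the PR already uses, and the exact signature is illustrative):

from functools import lru_cache

import networkx as nx

@lru_cache(typed=True)
def cached_gnp_random_graph(num_nodes, edge_prob, seed=42, directed=False):
    # identical argument combinations are returned from the in-memory cache
    # instead of being regenerated
    return nx.fast_gnp_random_graph(num_nodes, edge_prob, seed=seed, directed=directed)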

Schefflera-Arboricola (Member, Author) commented Nov 27, 2023

Does caching work well with floating point inputs like p? Nice binary values like 0.25 should work, but round-off can occur with 0.3, e.g. I guess if the roundoff is the same for each call of the function we’re ok.

If the combination of arguments is the same, then the function is not executed again and the graph is returned from the cache; it is independent of the type of the argument.

dschult (Member) commented Nov 27, 2023

Ahh... but are 0.3 and 0.29999999999999 the same? How does the caching software determine that the values are the same? It looks like lru_cache doesn't reliably cache floating-point arguments (due to round-off) unless the value is created in the same way each time. I think this will work for us as long as the value of, e.g., edge_prob is not recalculated a different way (e.g. by something like range(0.1, 1.0, 0.1)).

import functools

@functools.lru_cache
def f(x):
    print(x)

f.cache_info()
# CacheInfo(hits=0, misses=0, maxsize=128, currsize=0)

f(3/10)
# 0.3

f(1/10*3)
# 0.30000000000000004                (round-off)

f.cache_info()
# CacheInfo(hits=0, misses=2, maxsize=128, currsize=2)    <-- 3/10 and 1/10*3 are cached as different results

dschult (Member) left a comment:

I approve this PR.
Are there other issues to discuss as part of this PR?
We can always update things in a separate PR.

Schefflera-Arboricola (Member, Author):

I think it's ready. I just made a few changes in the README :)

MridulS merged commit aeb4eb2 into networkx:main on Jan 5, 2024 (7 checks passed)
jarrodmillman added this to the 0.1 milestone on Jan 5, 2024
jarrodmillman modified the milestones: 0.1, 0.2 on May 9, 2024
Labels: type: Enhancement (New feature or request)

4 participants