Added a function rem_vertices! #1047

simonschoelly · 2018-10-09T21:48:16Z

This adds a function rem_vertices!(g, vertex_list; keep_order=false) -> vmap, which removes multiple vertices from a graph. It returns a vector vmap that maps the vertices of the modified graph to the vertices of the unmodified graph. Setting the flag keep_order to true ensures that the order of the vertices is not changed.
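To make the meaning of vmap concrete, here is a minimal sketch in plain Julia that does not depend on LightGraphs. The helper name order_preserving_vmap is hypothetical and only mirrors what the description says for keep_order=true: entry i of the result is the old label of the vertex that ends up with label i.

```julia
# Hypothetical helper, not part of LightGraphs: the order-preserving vmap
# for a graph with n vertices after removing the vertices listed in vs.
function order_preserving_vmap(n::Integer, vs::AbstractVector{<:Integer})
    removed = falses(n)
    for v in vs
        removed[v] = true          # duplicates in vs are harmless
    end
    # vmap[i] is the old label of the vertex that is now labelled i
    return [v for v in 1:n if !removed[v]]
end

vmap = order_preserving_vmap(5, [2, 4])   # old vertices 1, 3 and 5 survive
```

With keep_order=false the function is free to return the surviving vertices in a different order, so callers should always translate labels through vmap rather than assume a particular layout.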

codecov · 2018-10-09T21:59:35Z

Codecov Report

Merging #1047 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #1047      +/-   ##
==========================================
+ Coverage   99.81%   99.82%   +<.01%     
==========================================
  Files          86       86              
  Lines        2746     2821      +75     
==========================================
+ Hits         2741     2816      +75     
  Misses          5        5

sbromberger · 2018-10-10T00:13:11Z

Thanks. From a quick read, this looks like it's O(V+E+something). Is my estimate correct?

simonschoelly · 2018-10-10T00:37:21Z

Yeah, that seems about right. There is a note in the comments on how it could be sped up a bit more, but from running the code it seems to be considerably faster than I expected anyway. At the moment it seems that even for removing a single vertex it might be faster than rem_vertex!.

sbromberger · 2018-10-10T17:17:50Z

> At the moment it seems that even for removing a single vertex it might be faster than rem_vertex!.

This is surprising to me. What size graph are you trying this on?

simonschoelly · 2018-10-10T17:36:16Z

So either I had a very special kind of graph where that was indeed the case, or I must have miscounted the digits, because I cannot reproduce my claims at all. It looks more like rem_vertex! is roughly ten times faster for a single vertex.

julia> g = erdos_renyi(10^4, 0.1)
{10000, 4998825} undirected simple Int64 graph

julia> g2 = copy(g); @time rem_vertex!(g2, 5000)
  0.002242 seconds (444 allocations: 6.743 MiB)

julia> g2 = copy(g); @time rem_vertices!(g2, [5000], keep_order=false)
  0.055374 seconds (18 allocations: 157.094 KiB)

julia> g2 = copy(g); @time rem_vertices!(g2, [5000], keep_order=true)
  0.012389 seconds (18 allocations: 157.094 KiB)

sbromberger · 2018-10-10T23:11:12Z

Those benchmarks look more correct. Swap-n-pop is inherently efficient as it requires one move, and an O(1) resize.

Does rem_vertices! scale linearly with the number of vertices removed?
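For readers unfamiliar with swap-n-pop, the idea can be sketched on plain adjacency lists as follows. This is an illustration only, assuming an undirected graph without self-loops stored as a Vector{Vector{Int}}; the function name swap_and_pop! is hypothetical, and the actual rem_vertex! in LightGraphs differs in detail (for example, it keeps neighbour lists sorted and updates the edge count).

```julia
# Hypothetical sketch on plain adjacency lists (undirected, no self-loops).
function swap_and_pop!(adj::Vector{Vector{Int}}, v::Int)
    n = length(adj)
    1 <= v <= n || return false
    # Detach v from each of its neighbours.
    for u in adj[v]
        filter!(x -> x != v, adj[u])
    end
    # Move the last vertex's list into slot v and relabel n as v.
    if v != n
        adj[v] = adj[n]
        for u in adj[v]
            replace!(adj[u], n => v)
        end
    end
    pop!(adj)                      # shrink by one slot
    return true
end
```

Only the lists of the neighbours of v and of the last vertex are touched, which is why the cost depends on those degrees rather than on the size of the whole graph.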

simonschoelly · 2018-10-10T23:47:26Z

No, it does not. The reason is that the algorithm checks every edge in any case. And if the order is kept, this is also necessary most of the time.

julia> g = erdos_renyi(10^4, 0.1);
julia> a = randperm(10^4);

# 1 vertex
julia> g2 = copy(g); @time rem_vertices!(g2, a[1:1], keep_order=false);
  0.054921 seconds (19 allocations: 157.125 KiB)
julia> g2 = copy(g); @time rem_vertices!(g2, a[1:1], keep_order=true);
  0.009877 seconds (19 allocations: 157.125 KiB)

# 10 vertices
julia> g2 = copy(g); @time rem_vertices!(g2, a[1:10], keep_order=false);
  0.078055 seconds (19 allocations: 157.188 KiB)
julia> g2 = copy(g); @time rem_vertices!(g2, a[1:10], keep_order=true);
  0.011177 seconds (19 allocations: 157.188 KiB)

# 100 vertices
julia> g2 = copy(g); @time rem_vertices!(g2, a[1:100], keep_order=false);
  0.122671 seconds (19 allocations: 157.938 KiB)
julia> g2 = copy(g); @time rem_vertices!(g2, a[1:100], keep_order=true);
  0.012469 seconds (19 allocations: 157.938 KiB)

# 1000 vertices
julia> g2 = copy(g); @time rem_vertices!(g2, a[1:1000], keep_order=false);
  0.133891 seconds (19 allocations: 165.000 KiB)
julia> g2 = copy(g); @time rem_vertices!(g2, a[1:1000], keep_order=true);
  0.020724 seconds (19 allocations: 165.000 KiB)

# 5000 vertices
julia> g2 = copy(g); @time rem_vertices!(g2, a[1:5000], keep_order=false);
  0.081602 seconds (21 allocations: 196.156 KiB)
julia> g2 = copy(g); @time rem_vertices!(g2, a[1:5000], keep_order=true);
  0.034571 seconds (21 allocations: 196.156 KiB)

# 10000 vertices
julia> g2 = copy(g); @time rem_vertices!(g2, a[1:10000], keep_order=false);
  0.012222 seconds (20 allocations: 235.219 KiB)
julia> g2 = copy(g); @time rem_vertices!(g2, a[1:10000], keep_order=true);
  0.012641 seconds (20 allocations: 235.219 KiB)

sbromberger · 2018-10-11T00:27:35Z

Those benchmarks look suspicious. Why should removing 10k vertices take 1/8 the time that removing 5k vertices takes, and why should removing 5k vertices take half the time that removing 1k takes?

simonschoelly · 2018-10-11T01:02:37Z

The algorithm works in 3 phases:

In the first phase, we calculate a map from the old vertex labels to the new ones.
In the second phase, we move the lists in fadjlist to their right position and then resize fadjlist.
In the third phase, we go over all the remaining lists and remove/relabel the vertices in them.
(Also, at some point we calculate the number of edges that get removed.)

So if we remove a lot of vertices, there will be fewer lists to process in the third phase. For the 10k case, there will be no lists left at all.
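The three phases can be sketched on plain adjacency lists as below. This is a simplified illustration, not the code from the PR: it assumes an undirected graph stored as a Vector{Vector{Int}}, shows only the order-preserving case, and leaves out the edge counting. The names remove_vertices_sketch! and newlabel are made up for this example.

```julia
# Hypothetical sketch of the three phases; order-preserving case only.
function remove_vertices_sketch!(adj::Vector{Vector{Int}},
                                 vs::AbstractVector{<:Integer})
    n = length(adj)
    removed = falses(n)
    for v in vs
        1 <= v <= n || throw(ArgumentError("vertex $v is out of range"))
        removed[v] = true
    end

    # Phase 1: map old labels to new ones (0 marks a removed vertex).
    newlabel = zeros(Int, n)
    vmap = Int[]
    for v in 1:n
        if !removed[v]
            push!(vmap, v)
            newlabel[v] = length(vmap)
        end
    end

    # Phase 2: move the surviving lists into place, then shrink.
    for (new, old) in enumerate(vmap)
        adj[new] = adj[old]        # new <= old, so nothing needed is overwritten
    end
    resize!(adj, length(vmap))

    # Phase 3: drop removed neighbours and relabel the rest.
    for list in adj
        filter!(u -> newlabel[u] != 0, list)
        map!(u -> newlabel[u], list, list)
    end
    return vmap
end
```

In this sketch the third loop runs over the surviving lists only, so when every vertex is removed it does no work at all, which is consistent with the timings for the 10k case.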

sbromberger · 2018-10-11T16:31:25Z

Thanks for the explanation. That makes sense.

sbromberger · 2018-10-11T16:38:04Z

src/SimpleGraphs/simpledigraph.jl

+function rem_vertices!(g::SimpleDiGraph{T},
+                       vs::AbstractVector{T};
+                       keep_order::Bool=false
+                      ) where {T}


If you're parameterizing SimpleDiGraph, then you should constrain T <: Integer. But more importantly, do you really want to insist that the vertex list is the same type as the graph eltype? Generally we just use Integer and cast internally as appropriate.

Isn't SimpleDiGraph{T} already restricted to T <: Integer by the definition of that datatype? I will change the vector.

It is, but our convention (to date) is to constrain parameters for clarity. I'm not opposed to changing that convention if it makes sense; this was just a suggestion.
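As an illustration of the convention discussed here, the sketch below constrains the type parameter explicitly and accepts a vertex list of any Integer element type, converting it to the graph's eltype internally. ToyDiGraph and normalize_vertices are hypothetical stand-ins so that the example runs without LightGraphs; they are not the signature that was merged.

```julia
# Hypothetical stand-in type; SimpleDiGraph itself is not redefined here.
struct ToyDiGraph{T<:Integer}
    fadjlist::Vector{Vector{T}}
end

# T is constrained explicitly for clarity. The vertex list may hold any
# Integer type; collect(T, vs) copies it into a fresh Vector{T}, so the
# caller's vector is never mutated by the in-place sort.
function normalize_vertices(g::ToyDiGraph{T},
                            vs::AbstractVector{<:Integer}) where {T<:Integer}
    return unique!(sort!(collect(T, vs)))
end

g = ToyDiGraph{Int8}([Int8[] for _ in 1:5])
remove = normalize_vertices(g, [3, 1, 3])   # Vector{Int8} with sorted, unique entries
```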

sbromberger · 2018-10-11T16:48:09Z

src/SimpleGraphs/simpledigraph.jl

+    # Sort and filter the vertices that we want to remove
+    remove = sort(vs)
+    unique!(remove)
+    lo, hi = extrema(remove)


Since remove is sorted (I think unique! preserves the ordering), lo, hi = (remove[1], remove[end]) is much more efficient than extrema:

julia> a = rand(Int, 100_000_000);

julia> sort!(a);

julia> @benchmark extrema($a)
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     53.086 ms (0.00% GC)
  median time:      58.190 ms (0.00% GC)
  mean time:        58.347 ms (0.00% GC)
  maximum time:     65.753 ms (0.00% GC)
  --------------
  samples:          86
  evals/sample:     1

julia> @benchmark ($a[1], $a[end])
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     2.088 ns (0.00% GC)
  median time:      2.096 ns (0.00% GC)
  mean time:        2.200 ns (0.00% GC)
  maximum time:     34.643 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1000
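A small self-contained check of the suggestion, using a made-up input vector: after sorting, unique! keeps the remaining elements in order, so the first and last entries are the minimum and maximum.

```julia
vs = [7, 3, 9, 3, 1]                    # example input, chosen for illustration
remove = unique!(sort(vs))              # sorted copy with duplicates dropped
lo, hi = first(remove), last(remove)    # constant-time lookups on a sorted vector
@assert (lo, hi) == extrema(remove)     # extrema has to scan every element
```

Both variants throw on an empty vector, so an empty vertex list would need to be handled before this point either way.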

sbromberger · 2018-10-11T16:50:34Z

src/SimpleGraphs/simpledigraph.jl

+    remove = sort(vs)
+    unique!(remove)
+    lo, hi = extrema(remove)
+    (one(T) <= lo && hi <= n) ||


1 <= lo <= hi <= n works also.
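For reference, Julia evaluates a chained comparison as the conjunction of its pairwise comparisons, so the one-liner covers both bounds and also checks lo <= hi. The helper name in_bounds below is hypothetical and only wraps the expression for demonstration.

```julia
# Hypothetical helper wrapping the suggested chained comparison.
in_bounds(lo, hi, n) = 1 <= lo <= hi <= n

in_bounds(2, 4, 5)               # true: all three comparisons hold
in_bounds(0, 4, 5)               # false: lower bound violated
in_bounds(Int8(1), Int8(2), 5)   # mixed integer types compare fine
```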

sbromberger · 2018-10-11T16:57:21Z

src/SimpleGraphs/simpledigraph.jl

+    if keep_order
+        # traverse the vertex list and shift if a vertex gets removed
+        i = 1
+        Δ = 0


It seems like Δ is always one behind i. Why define it at all?

Right, I don't need that.

added function rem_vertices!

4d97af1

sbromberger added the hacktoberfest label Oct 11, 2018

Merge branch 'master' into remove_vertices!

44b6c98

sbromberger reviewed Oct 11, 2018

simonschoelly and others added 4 commits October 12, 2018 01:23

small changes

d784d8f

Merge branch 'master' of git://github.com/JuliaGraphs/LightGraphs.jl into remove_vertices!

09db4bd

Merge branch 'remove_vertices!' of http://github.com/simonschoelly/LightGraphs.jl into remove_vertices!

acf979d

Merge branch 'master' into remove_vertices!

c42a150

sbromberger approved these changes Oct 12, 2018

sbromberger added 4 commits October 12, 2018 10:20

Merge branch 'master' into remove_vertices!

ace212d

Update simplegraph.jl

e8fbb52

Update simplegraph.jl

e3af5b7

Merge branch 'master' into remove_vertices!

d96037b

sbromberger merged commit dcbc9b2 into sbromberger:master Oct 12, 2018
