This repository has been archived by the owner on Oct 8, 2021. It is now read-only.

improved performance of core_number - credit to @abhinavmehndiratta #1281

Merged
sbromberger merged 4 commits into master from sbromberger/core_number on Jan 29, 2020

Owner

@sbromberger sbromberger commented Jan 26, 2020

julia> @benchmark core_number(g)
BenchmarkTools.Trial:
  memory estimate:  1.71 GiB
  allocs estimate:  12656029
  --------------
  minimum time:     8.276 s (9.20% GC)
  median time:      8.276 s (9.20% GC)
  mean time:        8.276 s (9.20% GC)
  maximum time:     8.276 s (9.20% GC)
  --------------
  samples:          1
  evals/sample:     1
julia> @benchmark core_number4(g)
BenchmarkTools.Trial:
  memory estimate:  1.04 GiB
  allocs estimate:  7963418
  --------------
  minimum time:     2.022 s (4.76% GC)
  median time:      2.030 s (4.46% GC)
  mean time:        2.029 s (4.46% GC)
  maximum time:     2.034 s (4.45% GC)
  --------------
  samples:          3
  evals/sample:     1
julia> g
{500000, 10000000} directed simple UInt32 graph

@sbromberger sbromberger force-pushed the sbromberger/core_number branch from 4ebba74 to c907f60 Compare January 26, 2020 02:18
@codecov

codecov bot commented Jan 26, 2020

Codecov Report

Merging #1281 into master will not change coverage.
The diff coverage is 100%.

@@           Coverage Diff           @@
##           master    #1281   +/-   ##
=======================================
  Coverage   99.66%   99.66%           
=======================================
  Files         104      104           
  Lines        5010     5010           
=======================================
  Hits         4993     4993           
  Misses         17       17

@sbromberger sbromberger force-pushed the sbromberger/core_number branch from c907f60 to cc937b6 Compare January 26, 2020 02:35
@sbromberger sbromberger added the efficiency / performance related to speed/memory performance label Jan 26, 2020
@@ -1,5 +1,3 @@
# Code in this file inspired by NetworkX.
Contributor

Should we remove this?

Owner Author

@sbromberger sbromberger Jan 27, 2020

I thought I had.

(edit: yes. That's what the red / - means :) )

Contributor

Sorry, I meant: should this stay?

Owner Author

I don't think so. The code was a 1:1 transcription of NetworkX before; now it's not. There's nothing tying this to NetworkX at this point.

n = nv(g)
deg = T.(degree(g)) # degree should really return T.
maxdeg = maximum(deg)
bin = zeros(T, maxdeg+1)
Contributor

This introduces a lot of very short names. Should we either comment on what each one is, or use longer names for the ones used throughout the function (not the ones local to a single for loop)?

Owner Author

I'll defer to @abhinavmehndiratta to provide better variable names.

@matbesancon
Contributor

Other than the two minor things I left, this looks good. As it's just accelerating an existing algorithm without changes in apparent behaviour, it should be fairly uncontroversial.

@sbromberger
Owner Author

@abhinavmehndiratta is there potential for parallelizing this?

@abhinavmehndiratta
Contributor

@sbromberger
I've added some comments, hope you find them useful!

function core_number(g::AbstractGraph{T}) where T
    has_self_loops(g) && throw(ArgumentError("graph must not have self-loops"))
    n = nv(g)
    deg = T.(degree(g)) # starts as the degrees; on return, holds the core number of each vertex
    maxdeg = maximum(deg) # maximum degree of any vertex in the graph
    bin = zeros(T, maxdeg+1) # bin sizes for the bin sort, later the starting position of each bin
    vert = zeros(T, n) # the vertices, sorted by degree
    pos = zeros(T, n) # position of each vertex in vert

    # count the number of vertices that fall in each bin
    for v = 1:n
        bin[deg[v]+1] += one(T)
    end
    # from the bin sizes, determine the starting position of each bin in vert
    start = one(T)
    for d = zero(T):maxdeg
        num = bin[d+1]
        bin[d+1] = start
        start += num
    end
    # sort the vertices in increasing order of degree and store them in vert
    for v in vertices(g)
        pos[v] = bin[deg[v]+1]
        vert[pos[v]] = v
        bin[deg[v]+1] += one(T)
    end

    # recover starting positions of the bins
    for d = maxdeg:-1:one(T)
        bin[d+1] = bin[d]
    end
    bin[1] = one(T)

    # core decomposition
    for i = 1:n
        v = vert[i]
        # each neighbor u of v with a higher degree loses one degree
        # and moves one bin to the left
        for u in all_neighbors(g, v)
            if deg[u] > deg[v]
                du = deg[u]
                pu = pos[u]
                pw = bin[du+1]
                w = vert[pw]
                if u != w
                    pos[u] = pw
                    vert[pu] = w
                    pos[w] = pu
                    vert[pw] = u
                end
                bin[du+1] += one(T)
                deg[u] -= one(T)
            end
        end
    end

    return deg
end
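
For anyone checking the output of the function above on small inputs: the core number of a vertex is the largest k such that the vertex belongs to a subgraph in which every vertex has degree at least k. The following is a minimal, deliberately slow reference sketch of that definition, written against plain adjacency lists for an undirected graph. The name `naive_core_number` and the adjacency-list input are assumptions made for illustration; they are not part of LightGraphs or of this PR, and the directed case (where the PR code uses `degree` and `all_neighbors`) is not covered.

```julia
# Hypothetical reference implementation, NOT part of LightGraphs or this PR.
# Repeatedly removes a vertex of minimum remaining degree ("peeling").
# Quadratic in the number of vertices, so only suitable for checking small graphs.
function naive_core_number(adj::Vector{Vector{Int}})
    n = length(adj)
    deg = [length(a) for a in adj]   # remaining degree of each vertex
    removed = falses(n)
    core = zeros(Int, n)
    k = 0                            # largest minimum degree seen so far
    for _ in 1:n
        # pick the not-yet-removed vertex with the smallest remaining degree
        v = 0
        for u in 1:n
            if !removed[u] && (v == 0 || deg[u] < deg[v])
                v = u
            end
        end
        k = max(k, deg[v])
        core[v] = k
        removed[v] = true
        # removing v lowers the remaining degree of its surviving neighbors
        for u in adj[v]
            removed[u] || (deg[u] -= 1)
        end
    end
    return core
end

# Triangle 1-2-3 with a pendant vertex 4 attached to vertex 3.
adj = [[2, 3], [1, 3], [1, 2, 4], [3]]
cores = naive_core_number(adj)   # triangle vertices get 2, the pendant gets 1
```

The bin-sort version in the comment above avoids the linear scan for the minimum-degree vertex by keeping the vertices sorted by degree in `vert` and moving a vertex one bin to the left whenever its degree drops.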

@sbromberger
Owner Author

Once this passes tests, given @matbesancon's and my review, I'll merge.

@sbromberger sbromberger merged commit 7c685b8 into master Jan 29, 2020
@sbromberger sbromberger deleted the sbromberger/core_number branch February 17, 2020 02:09