Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Complement of a hypergraph #380

Merged
merged 78 commits into from
Sep 9, 2023
Merged

Conversation

acombretrenouard
Copy link
Contributor

I wrote a function to complement a hypergraph.
It is computationally expensive, and to core function is a bit sketchy yet : I used strings and sets to enumerate all possible hyperedges, because I think it is quicker.

Let me know what you think !

@acombretrenouard
Copy link
Contributor Author

By the way : is it "a hypergraph" or "an hypergraph" ?
: )

@codecov
Copy link

codecov bot commented Jun 5, 2023

Codecov Report

Patch coverage: 93.79% and project coverage change: +0.97% 🎉

Comparison is base (387647a) 90.90% compared to head (a637640) 91.87%.
Report is 59 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #380      +/-   ##
==========================================
+ Coverage   90.90%   91.87%   +0.97%     
==========================================
  Files          42       60      +18     
  Lines        3177     4371    +1194     
==========================================
+ Hits         2888     4016    +1128     
- Misses        289      355      +66     
Files Changed Coverage Δ
xgi/algorithms/clustering.py 96.66% <ø> (ø)
xgi/algorithms/shortest_path.py 100.00% <ø> (ø)
xgi/linalg/hypergraph_matrix.py 100.00% <ø> (ø)
xgi/stats/edgestats.py 80.00% <ø> (ø)
xgi/stats/nodestats.py 100.00% <ø> (ø)
xgi/convert/bipartite_edges.py 44.44% <44.44%> (ø)
xgi/convert/incidence.py 76.92% <76.92%> (ø)
xgi/stats/dinodestats.py 82.35% <82.35%> (ø)
xgi/core/diviews.py 83.60% <83.60%> (ø)
xgi/stats/__init__.py 85.11% <84.00%> (+0.45%) ⬆️
... and 39 more

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

xgi/classes/function.py Outdated Show resolved Hide resolved
xgi/classes/function.py Outdated Show resolved Hide resolved
CHANGELOG.md Outdated Show resolved Hide resolved
xgi/utils/utilities.py Outdated Show resolved Hide resolved
xgi/classes/function.py Outdated Show resolved Hide resolved
xgi/classes/function.py Outdated Show resolved Hide resolved
xgi/classes/function.py Outdated Show resolved Hide resolved
xgi/classes/function.py Outdated Show resolved Hide resolved
xgi/classes/function.py Outdated Show resolved Hide resolved
Comment on lines 950 to 953
if M == 0:
max_edge_size = 0
else: # there is at least one non-empty edge
max_edge_size = H.edges.order.max() + 1
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be equivalent

Suggested change
if M == 0:
max_edge_size = 0
else: # there is at least one non-empty edge
max_edge_size = H.edges.order.max() + 1
max_edge_size = xgi.max_edge_order(H) + 1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On this : the function max_edge_order is defined above in the same python file (xgi/classes/functions.py). I could use it directly and write max_edge_size = max_edge_order(H) + 1 (with the xgi. prefix).

Is there a problem with this ?
The function complement might not work if it was placed above max_edge_order in the file, right ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, it's in the same file so you don't need the xgi.. The order of the functions in the file should not matter.

@leotrs
Copy link
Collaborator

leotrs commented Jun 5, 2023

Hi @acombretrenouard, thanks for your work on this.

I don't think we need the binary strings at all. You can use combinations from the stadard library, which does exactly the same thing (i.e. treat elements in a collection unique based on their position and not on their value). This is guaranteed to be faster.

xgi/classes/function.py Outdated Show resolved Hide resolved
@maximelucas
Copy link
Collaborator

Thanks Antoine! A few duplicate comments with Leo, we reviewed at the same time without knowing it.
I also agree with Leo that finding all possible hyperedges can be done in a simpler way:

  • we already have a powerset function that does that
  • or the complete_hypergraph(max_order=max_edge_size+1) function that creates a hypergraph with all possibles edges. Right now that functions creates nodes as range(N). If you want to use that, to preserve the original nodes, we should add an arguments nodes that specifies the nodes and overrides range(N) if not None.

@maximelucas
Copy link
Collaborator

Btw @leotrs @nwlandry did we not have a few basic hypergraph operations?
I tried H1 - H2 but it says "-" does not exist. Did we call it difference or something? In any case I couldn't find it in the docs, we should maybe make it more visible.

@leotrs
Copy link
Collaborator

leotrs commented Jun 5, 2023

H1.nodes - H2.nodes

nwlandry and others added 5 commits June 5, 2023 10:35
* Initial commit

* update docstrings

* moved read-only test

* added dihypergraph tutorial

* remove unused convert

* updated the node and edge stats

* style: format with black

* added ability to convert from dihypergraphs to hypergraphs

* Update Tutorial 8 - Directed Hypergraphs.ipynb

* added test for convert

* added source and target methods

* response 1 to review

* added first round of tests

* tests for stats

* new direportview file for organization.

* Update dihypergraph.py

* docs: fixed sphinx errors

* docs: added warning and updated docs
* docs: fix math part 2

* docs: corrected typos

---------

Co-authored-by: Pierre Robiglio <83019028+thomasrobiglio@users.noreply.github.com>
Co-authored-by: Maxime Lucas <maximelucas@users.noreply.github.com>
@acombretrenouard
Copy link
Contributor Author

acombretrenouard commented Jun 6, 2023

@nwlandry I think that what we are looking for is powerset (as @maximelucas pointed out) instead of combinations (we want subsets of all sizes...).

The problem with powerset is that it has no cutoff for the maximal size of hyperedges/subsets. I could add it though, and it would be close to what complete_hypergraph does.

However, once we have the full list L of all possible hyperedges, I don't see how we can pop those that are already present in the hypergraph we want to complement.
To me, we need to access the hyperedges in L knowing only their members (nodes), and not knowing their IDs.

@acombretrenouard
Copy link
Contributor Author

This thing with removing hyperedges from the list of all possible hyperedges to build Hc makes the use of the incidence matrix and/or a node ordering somehow necessary.
What do you think ?

@maximelucas
Copy link
Collaborator

The problem with powerset is that it has no cutoff for the maximal size of hyperedges/subsets. I could add it though, and it would be close to what complete_hypergraph does.

The powerset iterates over combinations: chain.from_iterable(combinations(s, r) for r in range(start, len(s) + end)). You could add a max_order argument that modifies end if specified to generate edges up to a certain order.

However, once we have the full list L of all possible hyperedges, I don't see how we can pop those that are already present in the hypergraph we want to complement. To me, we need to access the hyperedges in L knowing only their members (nodes), and not knowing their IDs.

One way would be to iterate over all those possible edges, and remove them from the list if they exist in the original HG. Otherwise if you could use the method difference for sets and but then you need to convert the list of hyperedges (all possible, and original) to sets.

@maximelucas
Copy link
Collaborator

This thing with removing hyperedges from the list of all possible hyperedges to build Hc makes the use of the incidence matrix and/or a node ordering somehow necessary. What do you think ?

I didn't understand what you're referring to, could you elaborate?

@acombretrenouard
Copy link
Contributor Author

acombretrenouard commented Jun 6, 2023

My last comment was on how to check if a possible hyperedge belong to the original hypergraph.

I didn't understand what you're referring to, could you elaborate?

I refer to the process of checking if a possible hyperedge belongs to the original HG :

One way would be to iterate over all those possible edges, and remove them from the list if they exist in the original HG.

How precisely do you think I can check this ?
Is there a way to access an hyperedge knowing only its nodes ?

@maximelucas
Copy link
Collaborator

How precisely do you think I can check this ?
Is there a way to access an hyperedge knowing only its nodes ?

You can do H.edges.members() to get the nodes inside each hyperedge.
Or H.edges.members(dtype=dict) (or equvalient but faster: H._edge) to have the same but keyed by hyperedge index.

Is that what you were looking for?

thomasrobiglio and others added 4 commits June 8, 2023 07:49
* feat : draw_dihypergraph function

* added figure to tutorial 8

* fix: added zorder for arrows

* test: added test for draw_dihypergraph

* added draw_dihypergraph to .rst file

* style: run black

* fix: reduce import

* feat: check if DiHypergraph + toggle for drawing squares

* test: added tests for check and marker toggle

* fix: correction by Leo + black

* fix: minor error in test

* feat: added possibility for edges labels

* fix: modified in order to handle hyperedges of size 2
* move json http request functions to utilities

* updated xgi_data module to reflect moving the json functions to utilities.

* added bigg_data module.

* style: format with black

* update docstrings

* update xgi_data module

* remove unnecessary imports

* added tests

* added docs

* Update bigg_data.py

* response to review

* added spaces
tlarock and others added 22 commits August 4, 2023 14:12
* Added weights option to to_line_graph function.

* Added tests for weighted line graph implementation.

* Added s=2 test.

* Added encapsulation dag code.

* Added encapsulation dag tests.

* Removed unnecessary combinations import.

* Updated docstring.

* included functions from publication (xgi-org#400)

* feat: added shuffle_hyperedges + tests

* feat: added shuffle_hyperedges + tests

* fix: filenames

* fix: tests

* style: black and isort

* feat: added function node_swap + test

* style: black and isort

* fix: error from python 3.11

* attempt at listing the available statistics (xgi-org#405)

* attempt at listing the available statistics

* fixing previous errors

* Attempt at implementing suggestions by nich

* second attempt at displaying the titles

* another trial

* titles in bold

* update of the panel for dihypergraph stats

* minor fixes

* change the hierarchical structure

* minor change

* moving MultiDi-stats in the right place

* adding decorators to the admonition box

* moving up the decorators

* up-version

* feat: enforce equal aspect for circular layout xgi-org#430 (xgi-org#432)

* feat: enforce equal aspect for circular layout xgi-org#430

* changed implementation of set_aspect

* Refactor draw module (xgi-org#435)

* refact: draw module

* style: black and isort

* fix xgi-org#331 (xgi-org#438)

* Added the ability for users to access the optional arguments of NetworkX layout functions. (xgi-org#439)

* Added another test.

* Refactored relations to subset_types. Exposed and documented empirical relations filtering function. Documentation improvements.

* Update xgi/convert/encapsulation_dag.py documentation

Co-authored-by: Maxime Lucas <maximelucas@users.noreply.github.com>

* Fix typo in xgi/convert/encapsulation_dag.py

Co-authored-by: Maxime Lucas <maximelucas@users.noreply.github.com>

* Updates to documentation.

* Minor refactor.

---------

Co-authored-by: Maxime Lucas <maximelucas@users.noreply.github.com>
Co-authored-by: Thomas Robiglio <83019028+thomasrobiglio@users.noreply.github.com>
Co-authored-by: Nicholas Landry <nicholas.landry.91@gmail.com>
* added cleanup method to all classes

* added more to the cleanup method

* `remove_node` and `remove_nodes_from` now inherit from `Hypergraph`

* fix 445 and add test

* added tests for SimplicialComplex and DiHypergraph

* lots of tests!!

* small fix

* small fix

* add tests

* `is_frozen` now a property

* updated docs

* added test
* Fix error when loading BiGG data

* format: isort and black
* created xgi in 1 min tutorial

* toning down a bit the texts

* working progress XGI in 5 min

* XGI in 5 min

* wip XGI in 15 mins

* complete XGI in 15 minutes

* Addressed comments on tut. 5 minutes

* Addressed comments on XGI in 15 minutes

* added hyperlinks in XGI in 1 minute

* comments from max

* minor modifications to xgi in 1 minute

* minor modifications to xgi in 5 minutes

* minor modifications to xgi in 15 minutes
* feat: return mappable to allow colorbar

* refact: simplified and improved draw_nodes by fully using plt.scatter

* docs: updated docstrings for draw_nodes. Removed node_ec_cmap

* fix: docstring

* fix: remove unused code

* feat: accept dict as input in draw

* add check for negative input values xgi-org#272

* feat: added node_shape argument in draw_nodes

* fix: updated draw and other version to match new draw_nodes. fix: most tests

* fix: last test

* docs: added colorbar note

* tests: fixed and added more. fix: now plotting nodes with non-finite color. warning: removed when cmap is ignoed because single color

* style: back and isort

* fix: raise for negative node size

* fix: raise for negative node lw

* test: added for cmap and vmin/vmax in draw_nodes

* tuto: new about draw_nodes

* tuto: added non-finite color values

* fix: import numpy in tuto

* test: try to fix import

* docs: added docstring to update_lims

* refact: new _draw_arg_to_arr function

* fix: add import

* trying to fix import problem

* lint: remove unused imports in draw

* trying to fix tests again: remove failing tests

* crazy idea: reimport numpy inside the faulty tests

* trying to remove dependency to np in one test

* tests: new for _draw_arg_to_arr

* style: black

* moved tests back to test_draw_utils - maybe they were moved in main after creating this branch?

* removed double import np - hopefully not necessary anymore

* feat: changed from clipping to interp between min/max. added 'rescaling_sizes' argument. test: added for new function.

* fix: consistent default vals in draw. feat: rescaling_sizes=True by default.

* minor review comment
* first commit

* updates

* removed `base` argument

* added unit tests

* style: format with isort and black

* response to review

* Update __init__.py

* response to review
* test: for node order

* fix: consistent node order
* Fixed xgi-org#458.

* Update contributors.rst

* update based on comments

* Update bigg_data.py

* Update test_bigg_data.py

* Update bigg_data.py
Co-authored-by: Maxime Lucas <maximelucas@users.noreply.github.com>
@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@nwlandry
Copy link
Collaborator

nwlandry commented Sep 9, 2023

I did a rebase (sorry for the mess 😄) and I fixed the outstanding issues.

@nwlandry nwlandry merged commit 0c16910 into xgi-org:main Sep 9, 2023
@nwlandry nwlandry mentioned this pull request Sep 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants