First Implementation of a Simplex Trie #220

Open
wants to merge 1 commit into
base: main

Conversation

ffl096
Member

@ffl096 ffl096 commented Aug 18, 2023

This implements a simplex trie, as presented in [1], as the backend data structure for the SimplicialComplex class. The same structure is used in gudhi's SC implementation. However, gudhi does not expose all the functionality we need, and its data structure is implemented in native code, so we cannot interact with it directly either.

Using a simplex tree should bring some nice performance improvements over the previous approach and fix some bugs along the way as well. I will add some comparisons later.

[1] Jean-Daniel Boissonnat and Clément Maria. The Simplex Tree: An Efficient Data Structure for General Simplicial Complexes. Algorithmica, pages 1–22, 2014
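
For readers unfamiliar with the structure, here is a minimal sketch of the idea, not the code in this pull request. Every simplex is stored as the path of its sorted vertex labels from the root, so each trie node corresponds to exactly one simplex. The names `TrieNode` and `SimplexTrieSketch` are hypothetical, vertex labels are assumed to be mutually comparable, and the brute-force face insertion via `itertools.combinations` is a simplification chosen for brevity rather than the insertion routine described in [1].

```python
from __future__ import annotations

from collections.abc import Hashable, Iterable, Iterator
from itertools import combinations


class TrieNode:
    """One node of the trie; the path from the root to it spells a simplex."""

    def __init__(self) -> None:
        self.children: dict[Hashable, TrieNode] = {}


class SimplexTrieSketch:
    """Illustrative simplex trie that stores a simplicial complex."""

    def __init__(self) -> None:
        self.root = TrieNode()

    def insert(self, simplex: Iterable[Hashable]) -> None:
        """Insert a simplex together with all of its faces."""
        vertices = sorted(set(simplex))  # assumes comparable labels
        for size in range(1, len(vertices) + 1):
            # Sub-tuples of a sorted tuple are sorted, so every path stays sorted.
            for face in combinations(vertices, size):
                node = self.root
                for label in face:
                    if label not in node.children:
                        node.children[label] = TrieNode()
                    node = node.children[label]

    def __contains__(self, simplex: Iterable[Hashable]) -> bool:
        """Follow the sorted labels down from the root."""
        node = self.root
        for label in sorted(set(simplex)):
            if label not in node.children:
                return False
            node = node.children[label]
        return node is not self.root  # the empty simplex is not stored

    def __iter__(self) -> Iterator[tuple]:
        """Yield every stored simplex as a sorted tuple."""
        stack = [(child, (label,)) for label, child in self.root.children.items()]
        while stack:
            node, path = stack.pop()
            yield path
            stack.extend(
                (child, path + (label,)) for label, child in node.children.items()
            )


trie = SimplexTrieSketch()
trie.insert([3, 1, 2])
print((2, 3) in trie, (1, 4) in trie)
```

A membership test walks one path whose length equals the number of vertices of the queried simplex, which is where the expected speed-up over scanning sets of simplices would come from.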

@ffl096 ffl096 force-pushed the frantzen-simplex-trie branch from e07d86b to f854940 Compare August 18, 2023 14:51
@ffl096
Member Author

ffl096 commented Aug 18, 2023

@mhajij The tests fail because coseg loads a pickled state of SimplicialComplex that includes internal properties. Independently of this pull request, that is a bad idea: any change to the data structure may lead to errors or, worse, undetected inconsistencies.

@mhajij
Member

mhajij commented Aug 18, 2023

> @mhajij The tests fail because coseg loads a pickled state of SimplicialComplex that includes internal properties. Independently of this pull request, that is a bad idea: any change to the data structure may lead to errors or, worse, undetected inconsistencies.

@ffl096
I am not sure we should merge this pull request now, because the ICML challenge participants might have used that dataset. I think we need to merge their pull requests first, before we merge this one. What do you think?

@mhajij mhajij self-requested a review August 18, 2023 21:34
@ffl096
Member Author

ffl096 commented Aug 19, 2023

This is a draft pull request; it is not to be merged right now regardless :)

However, just to clarify: I do not propose to remove the coseg dataset. We have to think about a reasonable data format for delivering the dataset that does not rely on pickle. Ideally, the return value of the coseg function should stay exactly the same.
SimplicialComplex objects in this PR are compatible with the previous implementation as long as the user does not access internal state. The ICML submissions should all be fine.
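
To make the "format that does not rely on pickle" idea concrete, here is one possible sketch, not a proposal for the final format. It stores only the simplices as plain JSON lists together with a version number, so a loader can rebuild the complex through the public constructor instead of restoring internal state. The function names, the `format_version` field, and the assumption of integer vertex labels are all hypothetical.

```python
import json
from collections.abc import Iterable

FORMAT_VERSION = 1  # hypothetical version tag for this illustrative format


def dump_simplices(simplices: Iterable[Iterable[int]]) -> str:
    """Serialize simplices as sorted lists of vertex labels."""
    payload = {
        "format_version": FORMAT_VERSION,
        "simplices": sorted(sorted(simplex) for simplex in simplices),
    }
    return json.dumps(payload)


def load_simplices(text: str) -> list[tuple[int, ...]]:
    """Parse the JSON payload back into a list of vertex tuples."""
    payload = json.loads(text)
    if payload.get("format_version") != FORMAT_VERSION:
        raise ValueError("Unsupported dataset format version.")
    return [tuple(simplex) for simplex in payload["simplices"]]


text = dump_simplices([(2, 1, 0), (3, 2)])
print(load_simplices(text))
```

The loaded tuples would then be passed to whatever public constructor builds the complex, so the file survives any change to the internal data structure.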

@mhajij
Member

mhajij commented Aug 19, 2023

> This is a draft pull request; it is not to be merged right now regardless :)
>
> However, just to clarify: I do not propose to remove the coseg dataset. We have to think about a reasonable data format for delivering the dataset that does not rely on pickle. Ideally, the return value of the coseg function should stay exactly the same. SimplicialComplex objects in this PR are compatible with the previous implementation as long as the user does not access internal state. The ICML submissions should all be fine.

We need to create a Data object that can be used in the higher-order context. I think the one available in torch is good enough.

This is an example of how it can be used in a higher-order DL model: https://github.com/pyt-team/TopoModelX/blob/569bd193f81d47e04891376676c034e90cc07554/tutorials/combinatorial/hmc_train.ipynb

@ffl096 ffl096 force-pushed the frantzen-simplex-trie branch from f854940 to 2bf961b Compare September 14, 2023 13:33
@mhajij
Member

mhajij commented Sep 14, 2023

@ffl096 I think we can merge this now. However, the tests are failing; can you please take care of that so we can merge? The lint check as well.

@ffl096
Member Author

ffl096 commented Sep 14, 2023

The dataset issue still stands, and fixing it is outside the scope of this pull request. We cannot reliably use pickled objects as data objects.

@mhajij
Member

mhajij commented Sep 14, 2023

> The dataset issue still stands, and fixing it is outside the scope of this pull request. We cannot reliably use pickled objects as data objects.

I cannot merge without passing tests. What do you think we should do? Should we fix the dataset issues first?

@ffl096
Member Author

ffl096 commented Sep 14, 2023

According to git blame, the coseg dataset downloaded from here was preprocessed by you, right? This repo does not contain the preprocessing script; can you provide it to me? The same goes for shrec_16.

@ffl096 ffl096 added the enhancement (New feature or request) and refactor labels Sep 14, 2023
@ffl096 ffl096 force-pushed the frantzen-simplex-trie branch 3 times, most recently from c38136b to 7357c14 Compare September 20, 2023 07:14
@USFCA-MSDS
Contributor

@ffl096 What do you want to do with this PR? I think we need SC to be faster and implemented correctly, but a lot of code relies on the datasets. What do you suggest?

@ffl096
Member Author

ffl096 commented Feb 9, 2024

As outlined above, the dataset structure has to be overhauled completely. That is outside the scope of this pull request, though, and needs to be done regardless, because the current system is highly unstable. Once that is done, this pull request is good to be merged.

@ffl096 ffl096 force-pushed the frantzen-simplex-trie branch 6 times, most recently from 5a1a11a to 5e530c0 Compare February 9, 2024 14:57
@ffl096 ffl096 force-pushed the frantzen-simplex-trie branch from 5e530c0 to af9524c Compare December 17, 2024 09:14

codecov bot commented Dec 17, 2024

Codecov Report

Attention: Patch coverage is 99.63100% with 1 line in your changes missing coverage. Please review.

Project coverage is 97.89%. Comparing base (dc8252d) to head (408e6b3).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| toponetx/classes/simplicial_complex.py | 98.68% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #220      +/-   ##
==========================================
+ Coverage   97.83%   97.89%   +0.06%     
==========================================
  Files          38       40       +2     
  Lines        3558     3663     +105     
==========================================
+ Hits         3481     3586     +105     
  Misses         77       77              


@ffl096 ffl096 force-pushed the frantzen-simplex-trie branch from af9524c to c1528d9 Compare December 17, 2024 10:17
@ffl096 ffl096 self-assigned this Dec 17, 2024
@ffl096 ffl096 force-pushed the frantzen-simplex-trie branch 2 times, most recently from c15f9c0 to bdde841 Compare December 17, 2024 13:40
@ffl096 ffl096 force-pushed the frantzen-simplex-trie branch from bdde841 to 408e6b3 Compare December 17, 2024 13:41
@ffl096 ffl096 requested review from mhajij and Copilot December 17, 2024 14:30
@ffl096 ffl096 marked this pull request as ready for review December 17, 2024 14:30

@Copilot Copilot AI left a comment


Copilot reviewed 7 out of 13 changed files in this pull request and generated 1 comment.

Files not reviewed (6)
  • test/classes/test_simplicial_complex.py: Evaluated as low risk
  • test/classes/test_combinatorial_complex.py: Evaluated as low risk
  • test/transform/test_delaunay.py: Evaluated as low risk
  • toponetx/classes/combinatorial_complex.py: Evaluated as low risk
  • toponetx/classes/colored_hypergraph.py: Evaluated as low risk
  • test/classes/test_reportviews.py: Evaluated as low risk
Comments suppressed due to low confidence (3)

toponetx/classes/simplex.py:65

  • The check for duplicate nodes should be placed before calling super().__init__ to avoid initializing the superclass with invalid data.
if len(elements) != len(set(elements)):
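
As an illustration of what this suggestion would look like, here is a minimal sketch. It assumes a superclass that simply stores the elements; `AtomStandIn` and `CheckedSimplex` are hypothetical stand-ins, not the classes in `toponetx/classes/simplex.py`.

```python
from collections.abc import Hashable, Iterable


class AtomStandIn:
    """Hypothetical stand-in for the real superclass; it only stores elements."""

    def __init__(self, elements: Iterable[Hashable]) -> None:
        self.elements = frozenset(elements)


class CheckedSimplex(AtomStandIn):
    """Validates the input before the superclass ever sees it."""

    def __init__(self, elements: Iterable[Hashable]) -> None:
        # Materialize once so a generator is not consumed by the check.
        elements = tuple(elements)
        if len(elements) != len(set(elements)):
            raise ValueError("A simplex cannot contain duplicate nodes.")
        super().__init__(elements)


print(sorted(CheckedSimplex([1, 2, 3]).elements))
```

With this ordering, invalid input raises before any state is set on the object.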

toponetx/classes/simplex.py:96

  • Sorting the item tuple here may lead to incorrect results if the elements are not comparable. Consider using a different approach to ensure item is a subset of self.elements.
item = tuple(sorted(item))
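
The concern can be reproduced in isolation: `sorted` raises `TypeError` when labels such as `1` and `"a"` are mixed. A possible alternative, assuming the check only needs set semantics, is a subset test on frozensets, which requires hashable but not comparable elements. The helper `is_face` is hypothetical and not part of the reviewed file.

```python
from collections.abc import Hashable, Iterable


def is_face(item: Iterable[Hashable], elements: Iterable[Hashable]) -> bool:
    """Return True if every node of `item` also occurs in `elements`."""
    return frozenset(item) <= frozenset(elements)


mixed = (1, "a", 2.5)

try:
    sorted(mixed)  # int and str cannot be ordered in Python 3
except TypeError as error:
    print("sorting failed:", error)

print(is_face(("a", 1), mixed))
```

Whether this fits the real class depends on how it stores its elements; if they are kept as a sorted tuple elsewhere, the sorting requirement already exists at construction time.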

test/classes/test_simplex.py:59

  • The test for __le__ method should raise TypeError for non-Simplex comparison, but it currently raises TypeError for any invalid comparison. Ensure the test specifically checks for non-Simplex comparison.
_ = s1 <= 1
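
A sketch of the behavior such a test would target, under the assumption that `__le__` returns `NotImplemented` for non-Simplex operands so that Python itself raises `TypeError`. `TinySimplex` and `raises_type_error` are hypothetical names used only for this illustration.

```python
from collections.abc import Callable, Hashable, Iterable


class TinySimplex:
    """Hypothetical minimal class that only models the `<=` comparison."""

    def __init__(self, elements: Iterable[Hashable]) -> None:
        self.elements = frozenset(elements)

    def __le__(self, other: object) -> bool:
        if not isinstance(other, TinySimplex):
            # Let Python try the reflected operation and then raise TypeError.
            return NotImplemented
        return self.elements <= other.elements


def raises_type_error(func: Callable[[], object]) -> bool:
    """Return True if calling `func` raises TypeError."""
    try:
        func()
    except TypeError:
        return True
    return False


s1 = TinySimplex([1])
print(s1 <= TinySimplex([1, 2]))
print(raises_type_error(lambda: s1 <= 1))
```

A test in this style pairs the failing comparison with a valid one, so a `TypeError` coming from an unrelated bug would show up in the valid case.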
