improve backbone extraction #24

Open
jboynyc opened this issue Oct 22, 2020 · 11 comments
@jboynyc
Owner

jboynyc commented Oct 22, 2020

It would be nice to move away from the disparity filter of Serrano et al. to something more robust in a future release.

Possibly relevant literature: 1, 2

@jboynyc jboynyc self-assigned this Oct 22, 2020
@jboynyc
Owner Author

jboynyc commented Nov 10, 2020

I will focus on implementing Liebig & Rao (2016).

@jboynyc
Owner Author

jboynyc commented Oct 11, 2021

@BradKML

BradKML commented Oct 27, 2021

Can one say that a "backbone" in a network operates as a "keyword" in a document-term graph? Or maybe a "top person" in a document-author or author-term graph?

@jboynyc
Owner Author

jboynyc commented Oct 27, 2021

No, sorry, by backbone extraction I mean the process of eliminating edges from a graph to find relevant connections. Currently I use a filtering technique that doesn't take the bipartite structure of the initial network into consideration.
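
For context, the kind of edge filtering meant here can be sketched roughly as follows. This is a minimal pure-Python illustration of the disparity filter idea from Serrano et al., not the code used in this package; the function name `disparity_filter`, the `(u, v, weight)` edge format, and the default `alpha` are assumptions made for the example. For a node with degree k and strength s, an edge of weight w gets the score (1 - w/s)^(k-1), and the edge is kept if that score falls below `alpha` for at least one endpoint.

```python
from collections import defaultdict


def disparity_filter(edges, alpha=0.05):
    """Keep edges that are significant for at least one endpoint.

    edges: iterable of (u, v, weight) tuples for an undirected graph.
    (Hypothetical helper for illustration only.)
    """
    edges = list(edges)
    strength = defaultdict(float)  # sum of incident edge weights per node
    degree = defaultdict(int)      # number of incident edges per node
    for u, v, w in edges:
        strength[u] += w
        strength[v] += w
        degree[u] += 1
        degree[v] += 1

    def score(node, w):
        k = degree[node]
        if k < 2:
            # With a single edge there is nothing to compare against,
            # so this endpoint never marks the edge as significant.
            return 1.0
        return (1.0 - w / strength[node]) ** (k - 1)

    return [
        (u, v, w)
        for u, v, w in edges
        if min(score(u, w), score(v, w)) < alpha
    ]


edges = [("a", "b", 100.0), ("a", "c", 1.0), ("a", "d", 1.0),
         ("a", "e", 1.0), ("a", "f", 1.0)]
print(disparity_filter(edges))  # [('a', 'b', 100.0)]
```

Note that the filter only sees the weighted one-mode graph. Nothing in it knows that the weights came from projecting a bipartite network, which is the limitation described above.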

@BradKML

BradKML commented Oct 27, 2021

@jboynyc Apologies, but connections in which context?

@jboynyc
Owner Author

jboynyc commented Oct 28, 2021

Sorry, in other words it's about finding significant edges and discarding insignificant ones. (connections = edges)

@BradKML

BradKML commented Oct 28, 2021

Define "significant". Would it treat edges that form rings as less significant (optimizing for spanning trees)? Would methods based on weighted edges favor stronger edges? Would it allow disconnected edges?

If this is hard to describe, would this extraction method apply to topic, terms, or author graphs?

@jboynyc
Owner Author

jboynyc commented Oct 28, 2021

The definition of "significance" differs by technique. Usually there's a comparison to a null model, with different techniques using different null models.

My question on this issue is specifically about techniques that use information about the bipartite network to aid backbone extraction of projections. Liebig & Rao and this paper outline some techniques, but so far I haven't found any usable implementations.
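
To make the idea of a null model that uses the bipartite structure concrete, here is a minimal sketch. It is not the method of Liebig & Rao and not the BiCM; it swaps in a simpler hypergeometric test on co-occurrence counts, which I am using only as an illustration. The names `cooccurrence_p_value` and `validated_projection`, the `memberships` input format, and the default `alpha` are all assumptions for the example. Two projected nodes with degrees d_i and d_j in a bipartite network with N nodes in the other mode are linked only if they share more neighbors than expected when neighbors are drawn at random.

```python
from math import comb


def cooccurrence_p_value(shared, deg_i, deg_j, n_other):
    """P(X >= shared) when X is hypergeometric: deg_j draws out of n_other
    items, of which deg_i count as successes. (Hypothetical helper.)"""
    total = comb(n_other, deg_j)
    upper = min(deg_i, deg_j)
    tail = sum(
        comb(deg_i, x) * comb(n_other - deg_i, deg_j - x)
        for x in range(shared, upper + 1)
    )
    return tail / total


def validated_projection(memberships, alpha=0.05):
    """memberships: dict mapping each projected node to the set of
    other-mode nodes it is linked to. Returns (i, j, shared) edges
    whose co-occurrence count is unlikely under the null model."""
    n_other = len(set().union(*memberships.values()))
    nodes = sorted(memberships)
    kept = []
    for idx, i in enumerate(nodes):
        for j in nodes[idx + 1:]:
            shared = len(memberships[i] & memberships[j])
            if shared == 0:
                continue  # no edge in the projection at all
            p = cooccurrence_p_value(
                shared, len(memberships[i]), len(memberships[j]), n_other
            )
            if p < alpha:
                kept.append((i, j, shared))
    return kept


memberships = {
    "A": {0, 1, 2, 3, 4},
    "B": {0, 1, 2, 3, 4},
    "C": {4, 5, 6, 7, 8},
    "D": set(range(20)),  # linked to everything
}
print(validated_projection(memberships))  # [('A', 'B', 5)]
```

In this toy example node D shares five neighbors with every other node, yet none of its edges survive, because a node linked to everything shares neighbors with everyone by construction. A weight-only filter on the projection cannot make that distinction. The sketch ignores edge weights in the bipartite network and applies no multiple-testing correction, both of which a real implementation would need to address.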

@jboynyc
Owner Author

jboynyc commented Nov 2, 2022

Here is an implementation of the bipartite configuration model: https://github.com/mat701/BiCM

I hesitate to add a dependency to this package until I get a chance to study it more closely.

  • How does it perform?
  • Is it a problem that the BiCM does not consider edge weights?
  • How well maintained is it? It has some recent commits, but doesn't seem to have been tested beyond Python 3.8.
  • Its dependencies mostly overlap with the current ones, but it would pull in numba as an additional dependency. If I remove the disparity filter, I would no longer have to depend on cython, so the net number of dependencies could stay the same.

@jboynyc
Owner Author

jboynyc commented Mar 31, 2023

Another relevant citation supporting my impression that using the disparity filter on projected one-mode networks is not a great idea: https://doi.org/10.1038/s42005-022-00856-9

2 participants