Document Influence Model #903

Open
bhargavvader opened this issue Sep 29, 2016 · 15 comments
Labels
difficulty medium Medium issue: required good gensim understanding & python skills feature Issue described a new feature

Comments

@bhargavvader
Contributor

bhargavvader commented Sep 29, 2016

Document Influence Model is a modelling technique described here which allows us to identify which documents most influenced a topic.

The C++ implementation is here, in the same place as the Dynamic Topic Modelling code.

During my implementation of the Python DTM, I sort of 'set up' the code for DIM.
For example, in line 249 of the DTM code, if the model type is DIM, it calls the appropriate method for DIM. The methods are usually quite similar to the DTM versions, which were translated to Python from the C++ code. By referring to the C++ code and the Python DTM code, it should be possible to implement DIM by coding up and plugging in the appropriate methods with a few changes.
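
To illustrate the kind of dispatch described above, here is a minimal sketch. Every name in it (`fit_topics`, `fit_topics_dtm`, `fit_topics_dim`, `FIT_METHODS`) is hypothetical and chosen for illustration only; none of them is taken from the gensim code, and the real DTM code may be organised differently.

```python
# Hypothetical sketch: these names are illustrative, not gensim's actual API.

def fit_topics_dtm(state):
    """Stand-in for the existing DTM update step."""
    return "dtm update on %s" % state


def fit_topics_dim(state):
    """Stand-in for the DIM update step that would still need to be written."""
    raise NotImplementedError("DIM update not implemented yet")


# Map each model type to the method that handles it.
FIT_METHODS = {"DTM": fit_topics_dtm, "DIM": fit_topics_dim}


def fit_topics(model_type, state):
    """Call the update method that matches the requested model type."""
    try:
        method = FIT_METHODS[model_type]
    except KeyError:
        raise ValueError("unknown model type: %r" % model_type)
    return method(state)
```

With a layout like this, adding DIM would mostly mean filling in the DIM stand-in while leaving the DTM path untouched.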

As a bonus, if one can find a better way to vectorise or speed up the DTM code while doing this, it'll be awesome!

Would be happy to discuss ideas or ways to go about doing this on this thread if anyone wishes to take this up.

@bhargavvader
Contributor Author

@tmylk, could you add the appropriate labels for this?

@tmylk tmylk added feature Issue described a new feature difficulty medium Medium issue: required good gensim understanding & python skills labels Sep 29, 2016
@anmolgulati
Contributor

anmolgulati commented Oct 2, 2016

I would like to take this up.
@bhargavvader Yes, you have already set up many of the method calls for DIM. I'll start by implementing the methods for DIM. Then we could look at speeding up DTM. Where exactly do you think it could be optimised?
[Edit]: If someone else wants to, they could take this up. I'm currently working on another issue.

@bhargavvader
Contributor Author

@anmol01gulati, have a look at #840 and tell me if anything interests you!

@mjawa

mjawa commented Oct 6, 2016

I am taking a shot at this.

@bhargavvader
Contributor Author

Awesome!

@bhargavvader
Contributor Author

@mjawa, any updates?

@mjawa

mjawa commented Oct 14, 2016

It will be another two weeks before I can take a stab at this.

@devashishd12
Contributor

@mjawa any updates on this? Can I take this up?

@piskvorky
Owner

@dsquareindia please go ahead, @mjawa excused himself due to private issues.

@bhargavvader
Contributor Author

bhargavvader commented Jan 11, 2017

Awesome, can't wait to see the PR, @dsquareindia 😉

@kris-singh
Contributor

I would like to work on this if that's okay, but I need to read the paper first. Is there a blog post or other resource that explains it?

@bhargavvader
Contributor Author

What exactly do you want explained?

@walfly

walfly commented Jul 1, 2017

Is anyone working on this?

@menshikh-iv
Contributor

Ping @dsquareindia @bhargavvader @kris-singh, what's the status here?

@bhargavvader
Contributor Author

I'll be glad to help anyone willing to do this.

Note: refer to #840 as well for more links and information.
