AutophrasingTokenfilter leaks memory #8

Open
kutschkem opened this issue Apr 20, 2015 · 4 comments

Comments

@kutschkem

If I understand the code correctly, phraseMap should be read-only. However, it gets modified: references to its phrase lists leak into currentPhrases, and other phrases are then added to those same lists. Not only does this leak memory, but I wouldn't be surprised if it also caused actual bugs in phrase recognition (false positives). To fix this, the phraseMap.get calls need to be wrapped in CharArraySet.copy. I filed a pull request against the emergecds fork, but the same changes apply here.
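
To illustrate the kind of aliasing described above, here is a minimal sketch. It is not the filter's actual code: it uses plain java.util collections in place of Lucene's CharArraySet, and the class name, method names, and sample phrases are all hypothetical. It only shows why using the set returned by a map lookup as a working set mutates the map, and how taking a copy first avoids that.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the filter's state; HashSet replaces CharArraySet.
public class PhraseMapAliasingSketch {

    // Builds a map from a first word to the phrases that start with it.
    static Map<String, Set<String>> buildPhraseMap() {
        Map<String, Set<String>> phraseMap = new HashMap<>();
        phraseMap.put("new", new HashSet<>(Arrays.asList("new york", "new york city")));
        phraseMap.put("san", new HashSet<>(Arrays.asList("san francisco")));
        return phraseMap;
    }

    // Leaky variant: the working set IS the set stored in the map,
    // so adding to it changes the map entry for `first`.
    static Set<String> collectLeaky(Map<String, Set<String>> phraseMap, String first, String second) {
        Set<String> currentPhrases = phraseMap.get(first);
        currentPhrases.addAll(phraseMap.get(second));
        return currentPhrases;
    }

    // Copying variant: the working set is a fresh copy, so the map stays untouched.
    // (Assumed analogue of wrapping phraseMap.get in CharArraySet.copy.)
    static Set<String> collectCopied(Map<String, Set<String>> phraseMap, String first, String second) {
        Set<String> currentPhrases = new HashSet<>(phraseMap.get(first));
        currentPhrases.addAll(phraseMap.get(second));
        return currentPhrases;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> leaky = buildPhraseMap();
        collectLeaky(leaky, "new", "san");
        System.out.println("leaky map entry size:  " + leaky.get("new").size());

        Map<String, Set<String>> copied = buildPhraseMap();
        collectCopied(copied, "new", "san");
        System.out.println("copied map entry size: " + copied.get("new").size());
    }
}
```

In this sketch the leaky variant grows the "new" entry from two phrases to three, and the entry now contains a phrase that does not start with "new", which is the sort of state that could plausibly produce false positives. The copying variant leaves the entry at two phrases.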

@kaismh

kaismh commented May 31, 2015

@kutschkem: From a quick look at the code, it does seem to be a leak. Did you encounter any problems after wrapping it with CharArraySet.copy?

@kutschkem
Author

@kaismh No, I didn't encounter functional problems before or after the fix. I don't understand the code well enough to be 100% sure that I didn't overlook anything, though.

@kaismh

kaismh commented Jun 1, 2015

@kutschkem: Many thanks. I am using your fix and have not encountered any issues so far. I will update you if any problems come up.

@kutschkem
Author

The relevant code is still unchanged in this repository. Is this repo dead?


2 participants