Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix emoji performance issue when multiple pipeline needed #308

Merged
merged 1 commit into from
Feb 20, 2019
Merged

fix emoji performance issue when multiple pipeline needed #308

merged 1 commit into from
Feb 20, 2019

Conversation

yishengjin1413
Copy link
Contributor

We meet a performance issue when we need to use multiple pipeline. After checking the profiling file and have some local test, I find EmojiParser.Initialize is the problem. At the meantime, we don't need to build the PrefixTree for each instance.

This PR have no logic change, just move the logic to the static constructor.

@MihaZupan
Copy link
Collaborator

MihaZupan commented Feb 20, 2019

Did you profile the master branch or the NuGet release? The emoji parser has been updated in between (#305)

@yishengjin1413
Copy link
Contributor Author

@MihaZupan I saw your changes:) Actually, I profiled the release version. But I also run with the master branch, there is not too much improve in my case. We have lots of pipelines need to build and the Initialize will be invoked lots of times. That's the problem.

@MihaZupan
Copy link
Collaborator

That's odd, there should be a noticeable difference if the emoji init was bottleneck.

@xoofx xoofx merged commit 7f0b39a into xoofx:master Feb 20, 2019
@MihaZupan
Copy link
Collaborator

This PR have no logic change, just move the logic to the static constructor.

This change removes the option to add custom emojis per instance and currently ignores the instance dictionaries all together.

Those should at least be removed if really not necessary as they are currently misleading.

@yishengjin1413
Copy link
Contributor Author

From the code, I thought your most improvement should be in Match function, right? I checked my test contents and there are not too much emoji actually.

@xoofx
Copy link
Owner

xoofx commented Feb 20, 2019

Those should at least be removed if really not necessary as they are currently misleading.

Sorry, missed that, indeed they should be removed.

@MihaZupan
Copy link
Collaborator

MihaZupan commented Feb 20, 2019

Biggest change is in the construction time and memory allocations. The actual match is very similar speed wise.

Between the two branches, init time for the emoji parser goes down 7.5x times and allocations are also reduced by 10x.

@yishengjin1413
Copy link
Contributor Author

@MihaZupan @xoofx I've remove the two Lazy init in next PR.

@yishengjin1413
Copy link
Contributor Author

@MihaZupan Thanks for let me know. Sounds great and I would like to have another try to profile the master branch. May be I missed something. But this PR change indeed saved lots of time for us.

@yishengjin1413
Copy link
Contributor Author

@MihaZupan I have to say you did a really good job. 👍🏻 I tried in my local with just your changes and the init is not a big problem then. I just stop our tool last time when the running time is longer than my expectation, but actually your changes did work. So this PR become a nice to have now.

@xoofx
Copy link
Owner

xoofx commented Feb 20, 2019

oops, ok, sorry to follow this too quickly... so we should revert your both changes @yishengjin1413 or?

@yishengjin1413
Copy link
Contributor Author

@xoofx @MihaZupan Really thanks for following up with this PR. From my side this PR is still an improvement for us since we really don't need to init it for each instance, but it's not a block issue for us now. Do you think allow the user to customize the emoji is required? Or maybe he can contribute the dictionary here and then we can all use the new emoji? Anyway, I can revert these two PRs if you don't like these. Just let me know.

@xoofx BTW, do you have any plan when you will release a new version? We need a new release to solve the performance issue.

@MihaZupan
Copy link
Collaborator

No allocations is better than few allocations.
I'd guess that using custom emojis is not common, so we could always add a "CustomEmojiParser" later.

@xoofx
Copy link
Owner

xoofx commented Feb 20, 2019

@xoofx BTW, do you have any plan when you will release a new version? We need a new release to solve the performance issue.

If you are both ok with the changes, I will do it today

@yishengjin1413
Copy link
Contributor Author

@xoofx Really thanks for your help~ 😉

@MihaZupan
Copy link
Collaborator

MihaZupan commented Feb 21, 2019

I'll submit a PR to reduce memory allocations of the prefix tree by another ~40% and improve build times today.
Edit: #310

@xoofx
Copy link
Owner

xoofx commented Feb 21, 2019

ok, I will push a new version right after, thanks @MihaZupan for all your hard work on this! 👍

mlaily added a commit to mlaily/markdig that referenced this pull request Jan 12, 2020
mlaily added a commit to mlaily/markdig that referenced this pull request Jan 16, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants