Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CheckForInvalidUtf8 performance improvement; code cleanup #1582

Merged
merged 4 commits into from
Nov 22, 2022

Conversation

hordi
Copy link
Contributor

@hordi hordi commented Nov 20, 2022

No description provided.

@sdottaka
Copy link
Member

Thank you for the PR.

I have measured the performance of the CheckForInvalidUtf8 function before and after the modification using the contents of the following file.

C:\Program Files (x86)\Windows Kits\10\References\10.0.19041.0\Windows.Foundation.UniversalApiContract\10.0.0.0\jp\Windows.Foundation.UniversalApiContract.xml

The modified version appears to be just a bit slower.
Perhaps the change to calling memcpy() is slowing it down?

		{
			WMPROFILE(L"1CheckForInvalidUtf8");
			for (int i = 0; i < 100; i++)
				ucr::CheckForInvalidUtf8(reinterpret_cast<const char*>(src), len);
		}
		{
			WMPROFILE(L"2CheckForInvalidUtf8");
			for (int i = 0; i < 100; i++)
				ucr::CheckForInvalidUtf8_2(reinterpret_cast<const char*>(src), len);
		}
Before: 2066590[us]
After: 2219325[us]

@hordi
Copy link
Contributor Author

hordi commented Nov 21, 2022

Hello,

memcpy in the code for fixed small size (1, 2, 3, 4 bytes as here) doesn't really call actual function - you could just compare asm-files generated for that cpp-file before and after. But I'll run perf-testing again (I was using amd-profiler and have had at least 20% improvement)?

I've verified assembly and see unexpected memcpy in one place, hold on - I'll update

CheckForInvalidUtf8: fixed unexpected memcpy call (intrinsic not used)
@hordi
Copy link
Contributor Author

hordi commented Nov 21, 2022

I've fixed the issue.
At the same time do not see a reason why Intrinsic optimization not used in this project.

Performance comparison for that file on my PC in msecs
baseline: 2421
new: 1826
baseline: 2416
new: 1815

@sdottaka
Copy link
Member

Thanks for the correction.

I tried it. I thought it would improve because when I looked at the .asm file, the call to memcpy disappeared from that file, but strangely enough it didn't change much.

Note that this PR is faster for files containing only ASCII characters and a little slower for files containing many non-ASCII characters.

Example 1. Text file containing only ASCII characters

https://raw.githubusercontent.com/WinMerge/winmerge/master/Docs/Manual/EN/Command_line.xml

before: 268144[us]
after: 253571[us]

Example 2. text file containing many non-ASCII characters

https://raw.githubusercontent.com/WinMerge/winmerge/master/Docs/Manual/JP/Command_line.xml

before: 344434[us]
after: 369745[us].

unicoder.zip

Processor	AMD Ryzen 7 PRO 4750U with Radeon Graphics   1.70 GHz
Installed RAM	16.0 GB (15.4 GB usable)
Edition	Windows 11 Pro
Version	22H2
OS build	22621.819

@hordi
Copy link
Contributor Author

hordi commented Nov 22, 2022

I have consistent significant perf difference for any file. How do you measure the time?

See my example - just update in main.cpp file-path to actual xml-file.
utf8_perf.zip

For example1:
baseline: 348
new: 261
baseline: 376
new: 320
baseline: 498
new: 270
baseline: 406
new: 270
For example 2:
baseline: 481
new: 386
baseline: 457
new: 374
baseline: 460
new: 366
baseline: 469
new: 365

@sdottaka
Copy link
Member

As shown in the attached file, I added the code for measurement to the Guess function of codepage_detect.cpp and performed a comparison with WinMerge.
Measurement results are output to the Output pane of Visual Studio.

patch.zip

Thank you for attaching the utf8_perf.zip.

I tried it on my PC and got the same result as yours.

When I apply the same compiler optimization as WinMerge as below, only the first result seems to be slow for some reason. I think this is what I saw.

image

image

It seems to be due to compiler optimization settings. Your PR itself is good, so I'll merge it for now.

As for the compiler optimization of the WinMerge project, I don't like the binary size to increase, so I'm using the current settings, but I'll reconsider later.

@sdottaka sdottaka merged commit 04596f6 into WinMerge:master Nov 22, 2022
@hordi
Copy link
Contributor Author

hordi commented Nov 22, 2022 via email

@sdottaka sdottaka added this to the v2.16.25 milestone Dec 18, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants