Discussion: [uwp] please, either allow System.IO all over HDD, or drastically improve speed of StorageFolder queries #1465

Closed
jtorjo opened this issue Oct 19, 2019 · 34 comments
Labels
discussion General discussion

Comments

@jtorjo

jtorjo commented Oct 19, 2019

Discussion: [uwp] allow System.IO all over HDD

...or drastically improve speed of StorageFolder queries

My preference would clearly be to just have access to System.IO, once I've requested
<rescap:Capability Name="broadFileSystemAccess"/>

This is clearly possible, since a Desktop Bridge app uses System.IO and can do as it pleases. It would clearly be the best case scenario.

Otherwise, StorageFolder queries should become insanely faster.

I have a 7200rpm HDD, and a folder with 797 pictures. Using System.IO to enumerate the files and get their length is <1ms.

Using a StorageFolder query is simply worse than javascript from 20 years ago - it takes over 9 seconds.

That is >9000 times slower.

No matter how you slice or dice it, I don't even have words for how slow this is.

StorageFileQueryResult query_result = last_query_;
if (dir != last_dir_) {
	var folder = await StorageFolder.GetFolderFromPathAsync(dir);

	// https://blogs.msdn.microsoft.com/adamdwilson/2017/12/20/fast-file-enumeration-with-partially-initialized-storagefiles/
	// note: does not seem to be any big diff compared to .GetFilesAsync()
	QueryOptions query = new QueryOptions() {
		FolderDepth = FolderDepth.Shallow,
		//Filter out all files that have WIP enabled on them
		ApplicationSearchFilter = "System.Security.EncryptionOwners:[] ",
		IndexerOption = IndexerOption.UseIndexerWhenAvailable,
		SortOrder = { new SortEntry { AscendingOrder = false, PropertyName = "System.DateModified" }}
	};
	query.SetPropertyPrefetch(PropertyPrefetchOptions.BasicProperties, new List<string>());
	query_result = folder.CreateFileQueryWithOptions(query);
	last_query_ = query_result;
}

uint start = 0, len = uint.MaxValue;
foreach (var f in (await query_result.GetFilesAsync(start,len))) {
	media.Add(new media_info {
		full_file_name = f.Path,
		is_video = is_video(f.Path),
		name = f.Name,
		write_date = f.DateCreated.Date,
		width = 0, height = 0, 
		thumbnail_source = null,
		file_size = (long) await f.GetSizeAsync()
	});
}
@jtorjo jtorjo added the discussion General discussion label Oct 19, 2019
@kmgallahan
Contributor

kmgallahan commented Oct 19, 2019

You misread this comment:

@jtorjo great question! This repo is the best spot to file any issues related to:

the UWP UI framework APIs (e.g. Windows.UI.Xaml)

@jesbis Thanks! Just posted a discussion about System.IO

System.IO and Windows.Storage are not UWP UI / WinUI / Windows.UI.Xaml related.

Not that I'm the gatekeeper to prevent discussion... just a bit out of scope though.

@jtorjo
Author

jtorjo commented Oct 19, 2019

@kmgallahan I see... At this point, I don't know where to post this.
Basically, when using UWP, and loading images from HDD, I do need to use StorageFolder and StorageFile. So, to me, they are UWP.

@kmgallahan
Contributor

kmgallahan commented Oct 19, 2019

At this point, I don't know where to post this.

@jesbis already answered this when you asked about it there:

In general there should be a "Send feedback about this product" link at the bottom of most documentation pages on docs.microsoft.com that will tell you the best way to provide feedback on a specific feature or API. (as seen at the bottom of this page)

For most aspects of UWP aside from the above, that will be to file bugs under the Developer Platform category in the Feedback Hub which has a number of relevant subcategories.

There are 1000s of UWP APIs (Windows.*) that could be discussed here, just as there are 1000s of .NET Native / Core / Framework (for XAML islands) API's that WinUI apps can use.

It is up to MS to decide what's appropriate to discuss here, and @jesbis did essentially lay that out already. This repo is for Windows User Interface related issues (UWP UI / WinUI / Windows.UI.*)

@jtorjo
Author

jtorjo commented Oct 19, 2019

Got it, sorry about that. Closing here

@jtorjo jtorjo closed this as completed Oct 19, 2019
@kwiqniss
Member

@jtorjo
Author

jtorjo commented Nov 26, 2019

@kwilkins Right, but please know that the original request still has not been answered correctly. At this time, there is no solution for this. If you look at the answers on Microsoft Q&A, you'll see that they both missed the point :(

@lukeblevins
Contributor

@jtorjo
Sorry to reply after such a delay. I found out you may have to do the following:
"If you have the broadFilesystemAccess capability, you can look into FindFirstFileExFromAppW and FindNextFileW. #include fileapifromapp.h for C++ and P/Invoke from api-ms-win-core-file-l1-2-1.dll for .NET. But then you'll have to wrap up so you can call from XAML bindings."

I was wondering if you could try the FindFirstFileExFromAppW instead, and let me know if it works for your scenario.

@lukeblevins
Contributor

It seems promising because the docs also say "this function adheres to the Universal Windows Platform app security model."

@jtorjo
Author

jtorjo commented Jan 18, 2020

@jtorjo
Sorry to reply after such a delay. I found out you may have to do the following:
"If you have the broadFilesystemAccess capability, you can look into FindFirstFileExFromAppW and FindNextFileW. #include fileapifromapp.h for C++ and P/Invoke from api-ms-win-core-file-l1-2-1.dll for .NET. But then you'll have to wrap up so you can call from XAML bindings."

I was wondering if you could try the FindFirstFileExFromAppW instead, and let me know if it works for your scenario.

Wow! One of the best kept secrets so far! :D
Having said that, I will definitely look it up - that would definitely be good news!

@lukeblevins
Contributor

Yeah, I wish I could try it for you, but I'm currently away from my PC for this week. I was however able to reproduce the ACCESS DENIED with other methods.

@jtorjo
Author

jtorjo commented Jan 19, 2020

@duke7553

Many, many thanks for pointing this out! As I said, this seems to have been a closely guarded secret, because apparently, until now, no one knew to point this out to me.

In debug mode, 1090 files, it takes <10ms to enumerate all. It's definitely awesome!

Having said that, here's the code to see everything in action:

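// version 1803 onwards
using System;
using System.Diagnostics;
using System.Runtime.InteropServices;
using Windows.UI.Xaml.Controls;
using FileAttributes = System.IO.FileAttributes;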

namespace TestFileSearch
{
	public sealed partial class MainPage : Page
	{
		public enum FINDEX_INFO_LEVELS
		{
			FindExInfoStandard=0,
			FindExInfoBasic=1
		}

		public enum FINDEX_SEARCH_OPS
		{
			FindExSearchNameMatch = 0,
			FindExSearchLimitToDirectories = 1,
			FindExSearchLimitToDevices = 2
		}

		// Unicode to match the CharSet used by the DllImport declarations below
		[StructLayout(LayoutKind.Sequential, CharSet=CharSet.Unicode)]
		public struct WIN32_FIND_DATA
		{
			public uint dwFileAttributes;
			public System.Runtime.InteropServices.ComTypes.FILETIME ftCreationTime;
			public System.Runtime.InteropServices.ComTypes.FILETIME ftLastAccessTime;
			public System.Runtime.InteropServices.ComTypes.FILETIME ftLastWriteTime;
			public uint nFileSizeHigh;
			public uint nFileSizeLow;
			public uint dwReserved0;
			public uint dwReserved1;
			[MarshalAs(UnmanagedType.ByValTStr, SizeConst=260)]
			public string cFileName;
			[MarshalAs(UnmanagedType.ByValTStr, SizeConst=14)]
			public string cAlternateFileName;
		}

		[DllImport("api-ms-win-core-file-fromapp-l1-1-0.dll", SetLastError = true, CharSet = CharSet.Unicode)]
		public static extern IntPtr FindFirstFileExFromApp(
			string lpFileName,
			FINDEX_INFO_LEVELS fInfoLevelId,
			out WIN32_FIND_DATA lpFindFileData,
			FINDEX_SEARCH_OPS fSearchOp,
			IntPtr lpSearchFilter,
			int dwAdditionalFlags);

		public const int FIND_FIRST_EX_CASE_SENSITIVE= 1;
		public const int FIND_FIRST_EX_LARGE_FETCH = 2;

		[DllImport("api-ms-win-core-file-l1-1-0.dll", CharSet=CharSet.Unicode)]
		static extern bool FindNextFile(IntPtr hFindFile, out WIN32_FIND_DATA lpFindFileData);

		[DllImport("api-ms-win-core-file-l1-1-0.dll")]
		static extern bool FindClose(IntPtr hFindFile);

		public MainPage() {
			this.InitializeComponent();
			test();
		}

		private void test() {
			var watch = Stopwatch.StartNew();
			var path = "D:\\john\\code\\buff\\__photawe\\test_photos";
			WIN32_FIND_DATA findData;
			FINDEX_INFO_LEVELS findInfoLevel = FINDEX_INFO_LEVELS.FindExInfoStandard;
			int additionalFlags = 0;
			if (Environment.OSVersion.Version.Major >= 6) {
				findInfoLevel = FINDEX_INFO_LEVELS.FindExInfoBasic;
				additionalFlags = FIND_FIRST_EX_LARGE_FETCH;
			}

			IntPtr hFile = FindFirstFileExFromApp(path + "\\*.*", findInfoLevel, out findData, FINDEX_SEARCH_OPS.FindExSearchNameMatch, IntPtr.Zero,
												  additionalFlags);
			var count = 0;
			// compare against INVALID_HANDLE_VALUE (-1); ToInt32() can overflow for 64-bit handles
			if (hFile != new IntPtr(-1)) {
				do {
					if (((FileAttributes) findData.dwFileAttributes & FileAttributes.Directory) != FileAttributes.Directory) {
						// do something with it
						var fn = findData.cFileName;
						++count;
					}
				} while (FindNextFile(hFile, out findData));

				FindClose(hFile);
			}
			Debug.WriteLine("count " + count + ", elapsed=" + watch.ElapsedMilliseconds);
		}
	}
}
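
The original post asked for file lengths as well as names. WIN32_FIND_DATA already carries the size and timestamps, so they can be read without another call. The following is a minimal sketch of that conversion; the class and method names (FindDataHelpers, ToFileSize, ToDateTimeUtc) are hypothetical and not part of the code posted above.

```csharp
using System;

// Hypothetical helpers (not from the thread) for reading values out of WIN32_FIND_DATA.
public static class FindDataHelpers
{
    // Combine the two 32-bit halves (nFileSizeHigh, nFileSizeLow) into one 64-bit length.
    public static long ToFileSize(uint nFileSizeHigh, uint nFileSizeLow)
    {
        return ((long)nFileSizeHigh << 32) | nFileSizeLow;
    }

    // Convert the two halves of a FILETIME (100 ns intervals since 1601-01-01 UTC)
    // into a UTC DateTime. ComTypes.FILETIME exposes both halves as int.
    public static DateTime ToDateTimeUtc(int dwHighDateTime, int dwLowDateTime)
    {
        long fileTime = ((long)(uint)dwHighDateTime << 32) | (uint)dwLowDateTime;
        return DateTime.FromFileTimeUtc(fileTime);
    }
}
```

Inside the enumeration loop this would be called as FindDataHelpers.ToFileSize(findData.nFileSizeHigh, findData.nFileSizeLow) and FindDataHelpers.ToDateTimeUtc(findData.ftLastWriteTime.dwHighDateTime, findData.ftLastWriteTime.dwLowDateTime).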

@ptorr-msft

FWIW, the original source of the (very terse) explanation was a comment on this post. I can elaborate further on FromApp APIs if you'd like until we get something into the MSDN docs.

@lukeblevins
Contributor

@ptorr-msft Yes, that's correct. Also, I'd imagine it would be a welcome improvement to mention this technique in the docs!

Peter, the Windows.Storage.BulkAccess namespace deserves a mention too. Though, I still can't figure out how to get the virtualized vector it returns to display in the Toolkit DataGrid control.

Thanks a lot!

@jtorjo
Author

jtorjo commented Jan 22, 2020

FWIW, the original source of the (very terse) explanation was a comment on this post. I can elaborate further on FromApp APIs if you'd like until we get something into the MSDN docs.

@ptorr-msft Quick question: will this work only with broadFileSystemAccess, or also if I have my folder in FutureAccessList?

@ptorr-msft

@jtorjo yes anything you have been granted access to (broadFileSystemAccess, the library capabilities, FileOpenPicker, launch via double-click from Explorer, etc.) as long as it is stashed in FutureAccessList.

@ptorr-msft

If anyone has feedback about the performance of these APIs relative to Windows.Storage and / or System.IO (in a non-UWP app) for your particular use-cases that would be great to hear. Is it acceptable? (Obviously P/Invoking the API is not natural, but imagine it was easier to use... does this feature solve your file access problems?)

Also if by chance you are trying to use any native libraries and they use CreateFile etc. then you can check out this SO post that explains how to redirect them to the new APIs even if you don't have the source code. There are some caveats mentioned though.

@jtorjo
Author

jtorjo commented Jan 22, 2020

@jtorjo yes anything you have been granted access to (broadFileSystemAccess, the library capabilities, FileOpenPicker, launch via double-click from Explorer, etc.) as long as it is stashed in FutureAccessList.

@ptorr-msft One thing I didn't get from the docs about FutureAccessList -> if I add a folder, do I automatically have access to its sub-folders?

@jtorjo
Author

jtorjo commented Jan 22, 2020

If anyone has feedback about the performance of these APIs relative to Windows.Storage and / or System.IO (in a non-UWP app) for your particular use-cases that would be great to hear. Is it acceptable? (Obviously P/Invoking the API is not natural, but imagine it was easier to use... does this feature solve your file access problems?)

@ptorr-msft At this time, I'm more than happy with the results. I need to re-plug an external drive and run some tests. On an SSD, the above API seems to be roughly 5-10 times slower, which, compared to 9000 times slower, is an insane improvement!

Also if by chance you are trying to use any native libraries and they use CreateFile etc. then you can check out this SO post that explains how to redirect them to the new APIs even if you don't have the source code. There are some caveats mentioned though.

The code you mention is quite cool. It is quite complicated to integrate in an existing app, but it's awesome nevertheless. It's definitely good to know, and hopefully you can add that into the MS docs so that people will know about it.

[later edit] One more question about FutureAccessList: assuming I add a file/folder to FutureAccessList, I can later use StorageFolder.GetFolderFromPathAsync (and the equivalent for a file) to access it, yes?

@lukeblevins
Contributor

@ptorr-msft With the FromApp APIs, I saw a performance increase of 10x in my app compared to the Windows.Storage. It is quite remarkable to see! The only concern I have is the inability to retrieve file thumbnails quickly from my UWP app. For instance, I still have to query the filesystem with Windows.Storage to fetch item thumbnails which works fine for indexed directories, but causes a noticeable slowdown in un-indexed directories.

I'm doing an operation on a separate thread to fetch some properties that aren't returned by the FromApp APIs (such as Thumbnail and DisplayType) simultaneously. Is there any way to directly interface with the shell to fetch these properties more quickly from a UWP app?

@lukeblevins
Contributor

Also, could you comment on the support for the Windows.Storage.BulkAccess.FileInformationFactory.GetVirtualizedItemsVector method in the Toolkit DataGrid?

@jtorjo
Author

jtorjo commented Jan 22, 2020

I'm doing an operation on a separate thread to fetch some properties that aren't returned by the FromApp APIs (such as Thumbnail and DisplayType) simultaneously. Is there any way to directly interface with the shell to fetch these properties more quickly from a UWP app?

That would be insanely helpful!

@ptorr-msft

@jtorjo, yes if you add a folder to the FutureAccessList you get access to all its content as well (including sub-folders). Also, yes you can use GetFolderFromPathAsync to retrieve it later. And I assume you mean 5-10 times slower than a Full Trust app? That's clearly better than 9,000 times slower, but is it fast enough?

@duke7553 I don't know if there's a faster way to get thumbnails. The APIs like IThumbnailCache::GetThumbnail weren't designed with privacy controls in mind, so probably will hit Access Denied pretty quickly... but maybe there's a way. I'll check internally. Sadly I know nothing about the Toolkit DataGrid.

@jtorjo
Author

jtorjo commented Jan 23, 2020

@jtorjo, yes if you add a folder to the FutureAccessList you get access to all its content as well (including sub-folders). Also, yes you can use GetFolderFromPathAsync to retrieve it later. And I assume you mean 5-10 times slower than a Full Trust app? That's clearly better than 9,000 times slower, but is it fast enough?

Seems fast enough. Right now I'm in the middle of integrating the above code in my app. It's not an easy task, but hopefully will have it ready later today. At that point, I can test this properly (namely, on an external HDD). Will get back to you

@jtorjo
Author

jtorjo commented Jan 23, 2020

@jtorjo, yes if you add a folder to the FutureAccessList you get access to all its content as well (including sub-folders). Also, yes you can use GetFolderFromPathAsync to retrieve it later. And I assume you mean 5-10 times slower than a Full Trust app? That's clearly better than 9,000 times slower, but is it fast enough?

It's insanely fast compared to what it used to be. Roughly 2000 files load in <10ms. So yeah, it's more than perfect. Thanks!

@brabebhin

Just as a heads up from a developer who's getting a bit tired of winRT blunders. The Storage File / folder API is unacceptable in the 21st century. Under no circumstances should we have to dig p/invoke to get decent (not fast, but decent) file system access speed. This API should power/replace the existing file system APIs. If you complain about developers not embracing your platforms and devices, maybe you should give a review to the APIs you create and don't use on your own. Storage API has been an embarrassment from the very beginning, and limited production apps on winRT. Fix it, don't give band aids.

Thanks.

@groovykool

@jtorjo, yes if you add a folder to the FutureAccessList you get access to all its content as well (including sub-folders). Also, yes you can use GetFolderFromPathAsync to retrieve it later. And I assume you mean 5-10 times slower than a Full Trust app? That's clearly better than 9,000 times slower, but is it fast enough?

@duke7553 I don't know if there's a faster way to get thumbnails. The APIs like IThumbnailCache::GetThumbnail weren't designed with privacy controls in mind, so probably will hit Access Denied pretty quickly... but maybe there's a way. I'll check internally. Sadly I know nothing about the Toolkit DataGrid.


It is a minefield!

If I use FindFirstFileExFromApp, I cannot get files from c:\Users\username\Onedrive\folder, but GetFilesAsync works?

If I use GetFilesAsync, I cannot get files from c:\Users\username\Dropbox\folder, but FindFirstFileExFromApp works?

Both folders were picked and are on the FutureAccessList.

@jtorjo
Author

jtorjo commented Feb 4, 2020

@groovykool I have never tested it with Onedrive / Dropbox - but in my tests, once I add a folder to FutureAccessList, using FindFirstFileExFromApp works 100%.

Have you tried it with my code (I've posted it before, on this thread)?

@groovykool

@groovykool I have never tested it with Onedrive / Dropbox - but in my tests, once I add a folder to FutureAccessList, using FindFirstFileExFromApp works 100%.

Have you tried it with my code (I've posted it before, on this thread)?

Yeah I used your code.

@jtorjo
Author

jtorjo commented Feb 4, 2020

@groovykool Not sure what to say - I recommend testing it on local folders.

By that I mean that I'm pretty sure Onedrive / Dropbox are somewhat virtual folders, and I'm not sure what's happening behind the scenes.

There's a similar story for the "Photos" and "Videos" folders - you can't use the above code on them.

To reiterate - I've tested quite a bit, and once I add a folder to FutureAccessList, I can use my code to iterate it.

@groovykool

groovykool commented Feb 6, 2020

Any idea how to search a root directory? It seems the docs are wrong or there is a bug.
These strings all result in a win32 exception.

D:\\* D:\\*.* D:* @"D:\*.*"

@jtorjo
Author

jtorjo commented Feb 6, 2020

@groovykool Normally, it should be @"D:\*.*" (with a single backslash)
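
One thing to watch for with the code posted earlier in this thread: it builds the pattern as path + "\\*.*", so a path that already ends in a separator (such as a drive root, "D:\") ends up with a doubled backslash. The following is a minimal sketch of a normalizing helper. The names SearchPattern and ForDirectory are hypothetical. It only shapes the string, and nothing in this thread confirms that enumerating a drive root then succeeds.

```csharp
using System;

// Hypothetical helper (not from the thread): build a "match everything" pattern
// for a directory without doubling the trailing separator.
public static class SearchPattern
{
    public static string ForDirectory(string directory)
    {
        if (string.IsNullOrEmpty(directory))
            throw new ArgumentException("directory must not be empty", nameof(directory));

        // Strip any trailing separators, then append exactly one backslash and the wildcard.
        return directory.TrimEnd('\\', '/') + "\\*.*";
    }
}
```

With this, both @"D:\" and @"D:\photos\" produce patterns with a single backslash before the wildcard.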

Having said that, I don't have the nerves to test it at this time, I'm too nervous with other UWP/WinRT bugs - which yeah, they never end

@HEIC-to-JPEG-Dev

It is a minefield. Searching for files is just broken and Microsoft accepts that; they have stated that "if you want access to large numbers of files, use WPF/WinForms/Win32". By "large number", they mean 400+.

To give you an idea of how bad it is in UWP, let's take the basic concept of enumerating files. If you use the index service (amazingly fast), you'll have major problems. The folder and sub-folders may not actually be indexed, so the enumeration will fail. If the index service is not up to date, the enumeration will be wrong. If files are added to the folder after an indexed enumeration, the order of the files in the index changes, shunting the original files off the queue, so the indexed result is now invalid. In all cases, you can't rely on or trust the indexed result.

You can also enumerate the normal way (for simplicity, let's use GetFilesAsync) and get the StorageFile objects. These are actually partial StorageFiles (a feature introduced to help with the performance issues) and will upgrade themselves to full StorageFiles automatically as needed. However, there are two problems with this approach. First, it's painfully slow, as shown by the comments in this thread. Second, it has a major memory leak: it uses the Runtime Broker, which creates objects in memory that are never taken off the heap after use, not until your app closes. To put that in context, 400 StorageFiles kept in a List will consume 1GB of memory; as you enumerate more files, this increases and starts to dump memory to disk, and eventually your app will fail and give system warnings. In Win32 you can keep 2.1 billion storageFile (fileinfo) objects in memory. I've only managed to get around this memory issue using some very technical techniques.

BroadFileSystemAccess gets you round the access rights problem and the ability to "remember" more than 1,000 Storage Objects for future, but not the StorageFile performance/memory problems.

@ZodmanPerth

Thank you @jtorjo for posting your pinvoke code, and @ptorr-msft for the lead. For awareness, the pinvoke method throws Win32 exceptions when browsing certain folders (such as C:\; more details on this Microsoft Q&A).

This makes the pinvoke method unsafe to use for someone writing a basic file-browser (like me), and we're stuck with the agonisingly slow UWP API.
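
For anyone who still wants to try the P/Invoke route in a browser-style app, one option is to inspect the Win32 error code instead of treating every failure as fatal. FindFirstFileEx-style functions return INVALID_HANDLE_VALUE on failure, and because the declaration posted earlier in this thread sets SetLastError = true, Marshal.GetLastWin32Error() returns the code. The following is a minimal sketch of classifying that code. The FindFailure enum and FindErrors class are hypothetical names. The numeric constants are the standard Win32 error values. Whether this is enough for folders such as C:\ has not been verified here.

```csharp
using System;

// Hypothetical classification (not from the thread) of Win32 error codes
// that a failed file search can report.
public enum FindFailure { Empty, NotFound, AccessDenied, Other }

public static class FindErrors
{
    public const int ERROR_FILE_NOT_FOUND = 2;   // no file matched the pattern
    public const int ERROR_PATH_NOT_FOUND = 3;   // the directory itself does not exist
    public const int ERROR_ACCESS_DENIED = 5;    // the app has no access to this location
    public const int ERROR_NO_MORE_FILES = 18;   // enumeration finished

    public static FindFailure Classify(int win32Error)
    {
        switch (win32Error)
        {
            case ERROR_FILE_NOT_FOUND:
            case ERROR_NO_MORE_FILES:
                return FindFailure.Empty;        // treat as an empty listing, not an error
            case ERROR_PATH_NOT_FOUND:
                return FindFailure.NotFound;
            case ERROR_ACCESS_DENIED:
                return FindFailure.AccessDenied; // a caller could fall back to Windows.Storage here
            default:
                return FindFailure.Other;
        }
    }
}
```

A caller would invoke FindErrors.Classify(Marshal.GetLastWin32Error()) immediately after the search function returns the invalid handle, before making any other P/Invoke call.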

@jtorjo
Author

jtorjo commented Aug 13, 2020

@ZodmanPerth It's so sad. But I've gotten used to M$ just doing marketing instead of actual code, so it does not surprise me one bit.

Basically ALL of the issues I've filed in the last year - nothing has been fixed. So yeah, if you're writing a basic file-browser, it's pretty much "Mission impossible".

9 participants