-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add async SourceText
APIs
#61489
Comments
Do we have an example of where this is an issue today? |
I'm working on utilizing this in FsAutocomplete. Having this would be great as it would cut down on threadpool starvation. |
I added a proposal for |
Another question is whether to add convenience APIs to read from a file directly by specifying a file path as opposed to a await using (var fileStream = File.Open("file.cs", new FileStreamOptions
{
Mode = FileMode.Open,
Access = FileAccess.Read,
Share = FileShare.Read,
Options = FileOptions.Asynchronous,
}))
{
sourceText = await SourceText.FromAsync(fileStream);
} instead of a simple await using (var fileStream = File.OpenRead("file.cs"))
sourceText = await SourceText.FromAsync(fileStream); and in my experience, virtually nobody does this correctly (and nobody I've ever met in practice). The vast majority of programmers don't even know that this option exists and would just use the latter code, assuming that it does asynchronous IO while in reality it just uses synchronous IO on a thread pool thread. |
I feel like my question above wasn't addressed |
@CyrusNajmabadi I don't understand your question. Are you really asking why async IO is useful or, especially in certain kinds of apps, preferred? There are many resources online about that, apart from what was mentioned here. |
Yes. I'm looking for real world examples where this would help. For context, in all our Roslyn testing we haven't seen this show up. So it's unclear how we'd validate this was even an improvement for real world scenarios. Generally, perf fixes are driven by cases demonstrating a problem. We can then check any potential solutions to ensure they resolve things. Thanks! :-) |
@CyrusNajmabadi This isn't necessarily about perf (in fact I would assume async would be a lot slower). In a UI app, you'd want to do this to avoid blocking the UI thread and making the app unresponsive. |
If we did this, I would expect we would instead allow passing in filestreamoptions, doc'ing that we would always add 'async' to whatever was passed in. |
Two of our three primary hosts are UI apps, but we haven't needed this. So this feels speculative to me. |
Note: blocking the UI thread applies to everything Roslyn does. This include every semantic operation (as there can and do cause io). I'm wary about this. We've already designed things so that consumers are expected to just not run this stuff on the UI thread. |
Yes, but that only applies to Visual Studio and the language server. People can and do use Roslyn in their own apps (see the comment by @TheAngryByrd). But you're right that I don't have a specific example, I'm just used to always using async IO by default and it struck me that Roslyn doesn't support when I tried working with a |
My point is, having not been a problem for two massive ui-hosts, I'd prefer we not create an unneeded solution if the existing ones are sufficient. That also ensures all consumers are doing things the same way, which is good for not having unintended deviations start creeping in (for example, perf problems experienced only by users of this new API). |
Is there any API review process/team to decide whether to take this API or not? |
Yes. Mark the API as 'ready to review' and we can take it. I'll walk it through teh process. |
@CyrusNajmabadi I can't label issues 😅 |
Oh! Good to know :) Would you like me to mark this as ready to take? |
@CyrusNajmabadi Yes please |
API Review
Conclusion: Rejected until concrete usage scenarios arise. |
What would you like me for to provide to address this? I've already mentioned the threadpool starvation issue that can happen when reading synchronously in async methods. |
@TheAngryByrd where are you observing this threadpool starvation? We have not encountered issues/traces/etc. ourselves in our own concurrent use of this API. As such, we lack an actual demonstration of an issue, as well as a real world example we can use to validate any potential solution. For all we know, our async impl might actually perform worse. For perf changes we need real cases to validate improvements with. Thanks! |
@TheAngryByrd Do you have any data on this? Maybe I could try implementing this in my own fork and you could check if it helps in your use case? |
I don’t on hand, but it would probably be possible to use a small aspnet core app with multiple routes. Load a giant text file like all of Shakespeare‘a work and watch dotnet-counters. Each route would either use the current SourceText read and one for whatever async api makes sense. Then hit each route with some load tool and see if the thread pool jumps up dramatically on the non async routes. |
@TheAngryByrd I thought you were saying you had this thread pool starvation problem in your own project? You were talking about FsAutoComplete. |
We need this to be a real problem people are hitting in the real world. ALso, just to be clear: the compiler doesn't even use these APIs when doing its work. Instead, it uses apis like EncodedStringText.Create. And it calls those apis concurrently in code like: RoslynParallel.For(
0,
sourceFiles.Length,
UICultureUtilities.WithCurrentUICulture<int>(i =>
{
//NOTE: order of trees is important!!
trees[i] = ParseFile(
parseOptions,
scriptParseOptions,
ref hadErrors,
sourceFiles[i],
diagnosticBag,
out normalizedFilePaths[i]);
}),
CancellationToken.None); I'm very wary of us just adding apis haphazardly to address hypothetical problems. We don't have hte data indicating there actually is a concern here, and the solution doesn't seem like it would actually adequately work for how we actually produce the documents. |
I put in many work arounds to try to reduce stress on the threadpool, it'll take some work to undo them and get data. |
Background and Motivation
When reading from a file, I usually read the data asynchronously, using methods like
File.ReadAllTextAsync
or creating aFileStream
with theFileOptions.Asynchronous
option and using the asyncStream
methods. Using asynchronous IO is the recommended practice in several kinds of apps, especially in desktop apps, and UWP for example initially even removed all the non-async file APIs from its version of .NET because they felt so strongly about async always being the right approach.Roslyn's
SourceText
providesFrom
methods to read the text from aStream
or aTextReader
(usually representing a file) andWrite
methods to write the text to aTextWriter
, but there are no async equivalents to this that would use the async APIs ofStream
,TextReader
andTextWriter
, which have been available for quite some time, including on .NET Framework. This seems like an omission.If this API is accepted, I'd be interested in making a PR.
Proposed API
I didn't add async versions of the
string
andbyte[]
overloads, because in those cases, the data is already in memory.Usage Examples
Questions
Should the APIs return
ValueTask
instead ofTask
?I think that is a question of whether these API calls will often complete synchronously. I imagine they will complete synchronously if users provide e.g. aStream
that only supports synchronous operations and implements the asynchronous APIs as returning synchronously, but I'm not sure if that will be common.FileStream
itself, if it isn't created with theFileOptions.Asynchronous
option to support truly asynchronous IO, will run the async operations on a background thread. The baseStream
class also implements the async methods as calling the synchronous ones on a background thread, if the async APIs aren't implemented in the derived class. The only built-in types I know of that implement the async APIs as returning synchronously areStringReader
andStringWriter
, and if someone is explicitly using those, using the synchronous APIs is the correct approach. Thus the question would be whetherFileStream
can return synchronously even withFileOptions.Asynchronous
, which I don't know.EDIT: From what I understand, it can now be beneficial to use
ValueTask
even when the method returns asynchronously on .NET Core and it's recommended to always useValueTask
from now on (see dotnet/runtime#33804), so I changed the proposal to useValueTask
.Another question: see #61489 (comment)
Alternative Designs
Alternatively, users would have to do synchronous IO, like today.
Risks
A small risk is that users could use these asynchronous APIs even when they pass in a
Stream
,TextReader
orTextWriter
that doesn't support truly asynchronous operations without realizing it, and the implementation of that type could either use a background thread or run synchronously. Asynchronous operations are also usually slower than synchronous ones.The text was updated successfully, but these errors were encountered: